THE PHYSICAL REVIEW

A journal of experimental and theoretical physics established by E. L. Nichols in 1893

Second Series, Vol. 97, No. 6

MARCH 15, 1955





Relation Between Canonical and Microcanonical Ensembles* 


Melvin Lax
Department of Physics, Syracuse University, Syracuse, New York 


(Received July 20, 1954) 


The equivalence of averages calculated in canonical and microcanonical ensembles is shown to depend
on the validity of a steepest-descent approximation. It is demonstrated that the microcanonical and canonical
procedures yield different values for the order parameter below the Curie temperature for spherical model
dipole lattices.





We propose to show that averages calculated in
the microcanonical¹ ensemble reduce to corresponding
values obtained in the canonical ensemble
provided the microcanonical calculation can be evaluated
by a steepest descent method. When this condition
is not satisfied, we shall demonstrate by means of an
example, the spherical model of a ferromagnet,²,³ that
the canonical ensemble can yield incorrect results.
This note was stimulated by the astute observation of
Lewis and Wannier⁴ that the integration in the complex
plane in the Berlin-Kac spherical model calculation
can apparently be avoided by using a canonical treatment
of the spherical constraint. We say apparently
because Lewis and Wannier⁵ have since then discovered
a discrepancy between their canonical treatment and
the corresponding microcanonical treatment of Berlin
and Kac in evaluating a fluctuation in the spherical
constraint.
Consider a phase or configuration space described
by the set of variables $\epsilon = \{\epsilon_1, \epsilon_2, \cdots, \epsilon_N\}$ and an extensive
phase function $\phi(\epsilon)$. By extensive we mean that the


*Supported in part by the Office of Naval Research and in 
part by the U. S. Air Force (Air Research and Development 
Command). 

1 An ensemble is said here to be microcanonical or canonical 
with respect to a given extensive variable according to whether that 
variable is constrained to take a fixed value, or merely required to 
have a given average value. (The usual classification is based on 
whether the energy is constrained or not. Ensembles in which 
variables in addition to the energy are canonical are usually re- 
ferred to as grand canonical.) 

2 T. H. Berlin and M. Kac, Phys. Rev. 86, 821 (1952).

3 M. Lax, J. Chem. Phys. 20, 1351 (1952), hereafter referred to as I.

4 H. W. Lewis and G. H. Wannier, Phys. Rev. 88, 682 (1952).

5 H. W. Lewis and G. H. Wannier, Phys. Rev. 90, 1131(E) (1953).


value of $\phi(\epsilon)$ is proportional to N, the size of the system.
An ensemble canonical in the energy $U(\epsilon)$ and in $\phi(\epsilon)$
has the partition function:


$Q(t) = \int \exp[-\beta U(\epsilon) - t\phi(\epsilon)]\, d\epsilon,$ (1)


where $\beta = (kT)^{-1}$, and t is the variable conjugate to $\phi(\epsilon)$.
The condition that $\phi(\epsilon)$ possess the mean value KN
leads to

$\langle \phi(\epsilon) \rangle = -(\partial/\partial t)(\ln Q) = KN.$ (2)


The mean value of any other observable $H(\epsilon)$ is given
by $H(t_s)$, where


$H(t) = [Q(t)]^{-1} \int H(\epsilon) \exp[-\beta U(\epsilon) - t\phi(\epsilon)]\, d\epsilon,$ (3)

and $t_s$ is the value of t determined by (2).


The corresponding partition function in an ensemble
microcanonical with respect to $\phi$ is


$\mathcal{Q} = \int \exp(-\beta U)\, \delta(KN - \phi)\, d\epsilon.$ (4)


Using the usual integral representation for a delta
function


$\delta(KN - \phi) = (2\pi i)^{-1} \int_{-i\infty}^{i\infty} dt\, \exp[t(KN - \phi)],$ (5)


we can express $\mathcal{Q}$ in terms of the canonical partition
function


$\mathcal{Q} = (2\pi i)^{-1} \int_{-i\infty}^{i\infty} \exp[NKt + \ln Q(t)]\, dt.$ (6)




Thus the microcanonical partition function is a weighted
superposition of canonical partition functions over
various values of the intensive parameter t. If, however,
$\phi(\epsilon)$ is a suitably chosen (i.e., "macroscopic") variable,
we may expect from (1) that for large N, $Q(t)$ has the
form $[Z(t)]^N$. The exponent in (6) then possesses only
terms of order N and a steepest descent evaluation is
permissible. The saddle point $t = t_s$ may be located by
setting the derivative of the exponent in (6) equal to
zero, which leads identically to Eq. (2)! Thus the major
contribution to $\mathcal{Q}$ comes from canonical ensembles in a
small interval around $t_s$, the canonical value of the
intensive variable t.
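
The steepest descent step described above can be written out as a short worked derivation. This is only a sketch: it assumes that $Q(t)$ is analytic near the saddle, that the contour in (6) may be shifted to pass through $t_s$ parallel to the imaginary axis, and it introduces the shorthand $f(t)$ for the exponent, which is not a symbol used in the paper.

```latex
\begin{align*}
f(t) &\equiv NKt + \ln Q(t), \qquad
f'(t_s) = NK + \left.\frac{\partial \ln Q}{\partial t}\right|_{t_s} = 0
  \quad\Longleftrightarrow\quad \text{Eq. (2)},\\
f''(t_s) &= \left.\frac{\partial^2 \ln Q}{\partial t^2}\right|_{t_s}
  = \langle \phi^2 \rangle - \langle \phi \rangle^2 > 0
  \qquad (\text{of order } N \text{ when } Q = Z^N),\\
\mathcal{Q} &= \frac{1}{2\pi i}\int_{t_s - i\infty}^{t_s + i\infty} e^{f(t)}\,dt
  \;\approx\; \frac{e^{f(t_s)}}{2\pi}\int_{-\infty}^{\infty}
     e^{-\frac{1}{2} f''(t_s)\,y^2}\,dy
  = \frac{\exp[NKt_s + \ln Q(t_s)]}{\sqrt{2\pi f''(t_s)}},
  \qquad t = t_s + iy .
\end{align*}
```

Because $f''(t_s)$ is of order N under these assumptions, the Gaussian factor confines the integration to a range of t of width of order $N^{-1/2}$ about $t_s$, which is the "small interval" referred to in the text.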

The mean value of $H(\epsilon)$ in the microcanonical
ensemble can be shown by a similar argument to be:


$\langle H(\epsilon) \rangle = (2\pi i \mathcal{Q})^{-1} \int H(t) \exp[NKt + \ln Q(t)]\, dt,$ (7)


or


$\langle H(\epsilon) \rangle \approx H(t_s).$ (8)


Passage from (7) to (8), however, requires (a) the existence
of a saddle point, which imposes requirements on
the nature of $\phi(\epsilon)$, (b) that $\ln H(t)$ is not of order N,
so that the saddle condition in (7) does not differ from
that for (6), and (c) that $H(t)$ does not possess a singularity
in the immediate neighborhood of $t_s$. Conditions
(b) and (c) impose requirements on the nature of $H(\epsilon)$.
All of these conditions must be met before (8) is valid,
i.e., before the canonical and microcanonical ensembles
yield the same mean value for $H(\epsilon)$.

It may now be of interest to illustrate these remarks
by applying them to the spherical model of a dipole
lattice, for which $\phi(\epsilon) = K \sum_i \epsilon_i^2$, where $K = np^2/(2kT)$.
Lewis and Wannier found that the mean of the variable
$H(\epsilon) = \sum_i \epsilon_i^4$ is not given correctly by the canonical
approach. They point out, however, that the sum $\sum_i \epsilon_i^4$
determines the fluctuations in the constraint $\phi(\epsilon)$ and
state that “discrepancies are particularly apt to occur
in those averages which are connected with fluctuations
in the assumed constraints.” They say, however, that
the canonical method is adequate for the derivation of
“all thermodynamic properties.”

We shall show that the last remark is not always
valid by demonstrating that the order parameter (a
thermodynamic variable proportional to the magnetization
in the ferromagnetic case) and its moments are not
given correctly by the canonical approach. According
to I(7.4) the order parameter can be defined as


$S(\epsilon) = N^{-1/2} |y_m| = \left| \sum_i \pm \epsilon_i \right| / N,$ (9)


where plus signs are to be used in the ferromagnetic
case, and alternating signs if the mode $y_m$ with the
largest eigenvalue $\lambda_m$ (i.e., lowest energy) corresponds
to antiferromagnetic order. The mean value of $[S(\epsilon)]^n$
in the canonical ensemble using a slight modification of
I(7.5) is given by


$S_n(t) = [NK(t - \lambda_m)]^{-n/2}\, \Gamma((n+1)/2)/\Gamma(1/2).$ (10)

After the saddle value for t is inserted,⁶ (10) becomes

$S_n(t_s) = [1 - (T/T_c)]^{n/2}\, \Gamma((n+1)/2)/\Gamma(1/2).$ (11)


It is clear from (10) that no choice of $t_s$ will make
$S_n(t_s) = [S_1(t_s)]^n$. This condition should be obeyed,
however, in the limit $N \to \infty$ since the order parameter
$S_1$ is a macroscopic thermodynamic variable below
the Curie temperature (i.e., the relative fluctuations in
$S_1$ are of order $N^{-1/2}$).
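
The mismatch between $S_n$ and $(S_1)^n$ in the canonical result can be made concrete with a few lines of Python. This is a minimal sketch that simply evaluates Eq. (11) as printed; the function names and the value chosen for $1 - T/T_c$ are illustrative assumptions, not part of the paper.

```python
import math


def canonical_moment(n, x):
    """Canonical n-th moment of Eq. (11), with x standing for 1 - T/T_c."""
    return x ** (n / 2) * math.gamma((n + 1) / 2) / math.gamma(0.5)


def mismatch_ratio(n, x):
    """S_n / (S_1)^n; this would equal 1 for a sharply defined variable."""
    return canonical_moment(n, x) / canonical_moment(1, x) ** n


x = 0.36  # illustrative value of 1 - T/T_c (assumed, not from the paper)
ratios = {n: mismatch_ratio(n, x) for n in (1, 2, 3, 4)}
for n, ratio in ratios.items():
    print(n, ratio)
```

The powers of $1 - T/T_c$ cancel in the ratio, so the result depends only on the gamma functions: for n = 2 it equals $\pi/2$ at every temperature below $T_c$, which is why no choice of $t_s$ can repair the canonical moments.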

The above discrepancy arises because of the singularity
in $S_n(t)$ at $t = \lambda_m$. Below the Curie temperature,
the saddle point⁶ $t_s$ differs from $\lambda_m$ only by terms of
order (1/N). Thus condition (c) is violated and the
saddle point or canonical method is invalid. With the
help of (6), (7), I(7.1), and I(7.6) the correct nth moment
is given by


$S_n = \dfrac{\int_C S_n(t)\,(t - \lambda_m)^{-1/2} \exp[NK(1 - T/T_c)(t - \lambda_m)]\, dt}{\int_C (t - \lambda_m)^{-1/2} \exp[NK(1 - T/T_c)(t - \lambda_m)]\, dt},$ (12)


where the contour C extends from $-\infty$ to $\lambda_m$ below the
real axis, counterclockwise around $\lambda_m$, and back to $-\infty$
above the real axis. With the help of the Hankel integral
formula,⁷ we obtain


$S_n = [1 - (T/T_c)]^{n/2},$ (13)


the correct result for the spherical model. 
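
The Hankel integral step can be sketched as follows. The derivation assumes the standard loop-integral form of Hankel's formula for $1/\Gamma(z)$ on the contour C described above, and it uses the shorthand $a = NK(1 - T/T_c)$, which is introduced here only for brevity.

```latex
\begin{align*}
\frac{1}{\Gamma(z)} &= \frac{1}{2\pi i}\int_C e^{s}\, s^{-z}\, ds
\quad\Longrightarrow\quad
\int_C (t - \lambda_m)^{-z}\, e^{a (t - \lambda_m)}\, dt
   = \frac{2\pi i\, a^{\,z-1}}{\Gamma(z)}, \qquad s = a\,(t - \lambda_m),\\[4pt]
S_n &= (NK)^{-n/2}\,
   \frac{\Gamma\!\left(\tfrac{n+1}{2}\right)}{\Gamma\!\left(\tfrac12\right)}\;
   \frac{a^{(n-1)/2} / \Gamma\!\left(\tfrac{n+1}{2}\right)}
        {a^{-1/2} / \Gamma\!\left(\tfrac12\right)}
   = \left(\frac{a}{NK}\right)^{n/2}
   = \left[1 - \frac{T}{T_c}\right]^{n/2}.
\end{align*}
```

The numerator uses $z = (n+1)/2$, because $S_n(t)$ from (10) contributes a further factor $(t - \lambda_m)^{-n/2}$, and the denominator uses $z = 1/2$. The gamma functions cancel exactly, so the microcanonical moments satisfy $S_n = (S_1)^n$, unlike the canonical values in (11).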

The spherical model need not be an adequate repre- 
sentation of a dipole lattice. But it forms a perfectly 
valid example for comparing the canonical and micro- 
canonical procedures. We may conclude that the 
complex integrations required by the microcanonical 
procedure can only be avoided (using a canonical 
procedure) when these integrations are easy to perform 
by a saddle point method. 

There is no guarantee, furthermore, that the canonical 
procedure is valid for all thermodynamic properties. 
However, once the canonical value $H(t)$ has been obtained
for a given thermodynamic variable, it should be
easy to verify whether conditions (b) and (c) for the 
validity of the canonical method are satisfied for the 
variable H. 





6 For the saddle condition see for example Eqs. (2.12) and (2.18) 
of M. Lax, Phys. Rev. 97, 629 (1955). 

7 E. T. Copson, Functions of a Complex Variable (Oxford
University Press, London, 1935), Sec. 9.6, p. 225.
It is shown that it is possible to make a transition from the vector wave equation to ray optics. However,
because of the possibility of polarization of the waves, it is no longer true that the rays will be normal to a
given wave surface, $\phi(x)$ = constant. In place of the usual curl-free velocity field that is defined by a scalar
wave equation we find that the new velocity field depends upon parameters that express its vorticity. The
expression for the velocity, and thus the direction of the rays, agrees in magnitude and direction with the
time-averaged Poynting vector, except for a term that is the curl of a vector, and which can be interpreted as
the intrinsic vorticity of the medium.


The methods employed in relating the motion of electromagnetic waves to the motion of a field of particle- 
like trajectories are equally applicable to other wave fields that possess internal degrees of freedom. The 
Pauli and Dirac theories of the electron make use of wave fields of this type. However, in this paper, no 
attempt is made to give such an interpretation to these theories. 





I. INTRODUCTION 


It is well known that in the absence of charged
particles and currents the electromagnetic field may
be represented by a vector potential A, which satisfies
the wave equation


$\nabla^2 \mathbf{A} - (\epsilon\mu/c^2)\, \partial^2 \mathbf{A}/\partial t^2 = 0,$ (1)


and the gauge condition

$\operatorname{div} \mathbf{A} = 0.$ (2)


$\epsilon$ and $\mu$ are the dielectric constant and magnetic
permeability, respectively, and are here taken as known
constants for a given medium.

It is customary to prove that ray optics is a consequence
of these equations, in the short-wavelength
limit, by a method due to Debye.¹ One assumes that
for a single component of A, there exists a solution of
the wave equation of the form


$u = R(x) \cos[k\phi(x) - \omega t],$ (3)


where $R(x)$ is the amplitude of the wave and $\phi(x)$ is
called the eiconal, and both are functions of position.
Substituting this value of u in the wave equation one
finds that there is a separation into two types of terms,
one multiplied by $\cos(k\phi - \omega t)$ and the other by
$\sin(k\phi - \omega t)$. If the equation is to hold for all values of
the time, then the coefficients of both the cosine and
sine terms must vanish separately. As a consequence
we derive the following two independent equations:


$\nabla \cdot (R^2 \nabla\phi) = 0,$ (4)
$k^2 (\epsilon\mu - \nabla\phi \cdot \nabla\phi) R + \nabla^2 R = 0, \quad (\omega/c = k).$ (5)


* Now at Stevens Institute of Technology, Hoboken, New
Jersey.

1 See A. Sommerfeld and J. Runge, Ann. Physik 35, 290 (1911).
Actually, the method of Debye which is given below has been 
considerably improved and one can secure a transition to ray 
optics directly from the vector wave equation. In this proof, 
however, although the amplitudes of each wave component differ 
from one another, the phase dependence is the same. 


The first has the form of a conservation law in a fluid,
where the particle velocity is given by grad $\phi$ and the
time-independent density of particles by $R^2$. The second
of the above equations, in the limit of short wavelengths
($k \to \infty$), takes the form of the eiconal equation:


$\nabla\phi \cdot \nabla\phi = \epsilon\mu.$ (6)


The surfaces, $\phi(x)$ = constant, satisfying Eq. (6), are
the surfaces of constant phase at any instant of time
for the solutions u. The normals to these surfaces are
given by grad $\phi$ and these normals coincide with the
direction of the rays at each point of space. The rays
will deviate from straight lines when the dielectric
constant or magnetic permeability are functions of
position.
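
The separation into cosine and sine terms that leads to Eqs. (4) to (6) can be written out explicitly. This is a sketch that assumes the scalar wave equation has the form $\nabla^2 u - (\epsilon\mu/c^2)\,\partial^2 u/\partial t^2 = 0$, which is the form consistent with Eq. (5); the abbreviation $\theta$ for the phase is introduced here for convenience.

```latex
\begin{align*}
u &= R\cos\theta, \qquad \theta \equiv k\phi(x) - \omega t,\\
\nabla^2 u &= \left(\nabla^2 R - k^2 R\,\nabla\phi\cdot\nabla\phi\right)\cos\theta
   - k\left(2\,\nabla R\cdot\nabla\phi + R\,\nabla^2\phi\right)\sin\theta,\\
-\frac{\epsilon\mu}{c^2}\,\frac{\partial^2 u}{\partial t^2}
   &= k^2 \epsilon\mu\, R\cos\theta \qquad (\omega/c = k),\\
\sin\theta:\quad & 2\,\nabla R\cdot\nabla\phi + R\,\nabla^2\phi = 0
   \;\Longleftrightarrow\; \nabla\cdot\left(R^2\nabla\phi\right) = 0
   \quad (\text{multiply by } R),\\
\cos\theta:\quad & k^2\left(\epsilon\mu - \nabla\phi\cdot\nabla\phi\right)R + \nabla^2 R = 0,\\
k \to \infty:\quad & \epsilon\mu - \nabla\phi\cdot\nabla\phi + \frac{\nabla^2 R}{k^2 R} = 0
   \;\longrightarrow\; \nabla\phi\cdot\nabla\phi = \epsilon\mu .
\end{align*}
```

The last line shows why the eiconal equation is a short-wavelength statement: the amplitude term $\nabla^2 R/(k^2 R)$ is dropped only when R varies slowly on the scale of a wavelength.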

Equations (4) and (5) could just as well have been 
obtained if we had chosen instead of the real solution 
for the wave equation, the complex solution 


$u = R e^{ik\phi},$ (7)


where R and ¢ are again real functions of the space 
coordinates. After substitution in the wave equation 
we would obtain a complex expression that vanished, 
and whose real and imaginary parts would give Eqs. (4) 
and (5). 

However, the problem of finding a transition to ray
optics for a general vector A satisfying (1) and (2)
cannot be solved in the simple way shown above. We
can see this most easily if we write A in the form


$\mathbf{A} = \begin{pmatrix} R_1 \cos(k\phi_1 - \omega t) \\ R_2 \cos(k\phi_2 - \omega t) \\ R_3 \cos(k\phi_3 - \omega t) \end{pmatrix},$ (8)


where $R_1$, $R_2$, $R_3$ and $\phi_1$, $\phi_2$, $\phi_3$ are in general different
functions of the spatial coordinates. The obvious question
arises: which one of the three separate phase
functions will serve as our eiconal? There cannot be
three separate eiconals for three distinct rays, since
the vector A expresses the motion of a single wave with
varying amplitude and phase at different points of
space.
In the following we shall show that from Eqs. (1)
and (2) one can define a new type of ray trajectory,
which no longer has the property of being normal to a
single wave surface, $\phi(x)$ = constant. This condition of
the normality of the rays to a single wave surface is
equivalent to the fact that the velocity field (the field
of ray trajectories) is curl-free. We shall show, however,
that our newly defined velocity field contains terms
which express the vorticity of the “ray fluid.” In a
number of cases of practical interest the vortex motion
is absent from the fluid and the rays follow paths which
are normal at each point in space to a single wave
surface, $\phi(x)$ = constant.

This restriction to the curl-free case, and further to 
the case of circularly polarized light, has been made the 
basis of a new representation of the electromagnetic 
field by Green and Wolf.² The point of view which
these authors have adopted is similar to ours, and 
indeed their results prove to be a special case of ours 
for circularly polarized light. For more general types 
of waves we find that it is necessary to consider the 
vorticity of the ray fluid. 

As a result of our assumed dependence of the vector 
potential on three distinct amplitudes, as well as three 
distinct phase factors (only four of these functions are 
independent because of the gauge condition), the vector 
wave equation is replaced by six equivalent equations, 
in much the same manner that the two equations (4) 
and (6) replaced the scalar wave equation. A particular 
combination of these six equations can be chosen so 
that of the new group of six relations one has the form 
of a conservation law, and thus may be used to define 
a velocity field describing the “flow of photons.” 
A second of these new equations reduces to the eiconal 
equation, (6), in the absence of vortex motion, and can 
thus be considered as the generalized eiconal equation. 
The four other equations describe the changes along 
the stream lines of the fluid in the four parameters that 
express the vorticity of the medium. 

Finally the question may be asked whether the ray 
trajectories that are defined from the conservation law 
are the same as those given by the time-averaged 
Poynting’s vector, which describes, on the average, 
the direction of energy flow of the electromagnetic field. 
For particular cases we shall see that our vector field 
coincides with the velocity field defined by Poynting’s 
vector. In general however, the current density that we 
are going to define with the aid of Eqs. (1) and (2) 
differs from the time-averaged Poynting’s vector by 
the curl of a certain vector. This vector, when integrated 
over the volume of a beam of light, is equal to the total 
angular momentum of the beam. For the quantized 
electromagnetic field the vector is sometimes called the 
intrinsic spin angular momentum of the photon. 


2 H. S. Green and E. Wolf, Proc. Phys. Soc. (London) A66,
1129 (1953). 


RALPH SCHILLER


II. NEW DEFINITION OF THE VELOCITY FIELD 


In this section, we shall show that from the vector 
wave equation, Eq. (1), one may define a more general 
velocity field than that derived from the scalar wave 
equation. This new velocity field gives the magnitude 
and direction of the ray trajectories at each point of 
space, although, unlike the scalar theory, the trajec- 
tories are not normal to any single wave surface. 

In order to define this general velocity field we may 
if we wish substitute the vector A in Eq. (8) in the 
wave equation and then show that a particular com- 
bination of the six resulting equations leads to a con- 
servation law. This conservation law can then be used 
to define the velocity field. However, we shall use a 
simpler and more direct method to define our velocity 
field. 

We assume that the vector potential A is a complex
vector with the time dependence $e^{-i\omega t}$, and whose real
and imaginary parts satisfy the wave equation. We may
then write the wave equations for A and its complex
conjugate A*:


$(ik/32\pi\mu)(\nabla^2 \mathbf{A}^* + k^2 \epsilon\mu \mathbf{A}^*) = 0,$
$(ik/32\pi\mu)(\nabla^2 \mathbf{A} + k^2 \epsilon\mu \mathbf{A}) = 0.$ (9)


(We have multiplied the wave equation by a constant 
coefficient so that the velocity vector will have the 
same coefficient as the momentum density of the electro- 
magnetic field.) 

If we take the scalar product of the first equation 
with A and the second with A*, and subtract the two 
equations, we find that we may interpret the vector J, 


$\mathbf{J} = (ik/16\pi\mu)[(\nabla \mathbf{A}^*) \cdot \mathbf{A} - (\nabla \mathbf{A}) \cdot \mathbf{A}^*],$ (10)


as a current density since it satisfies the conservation
law


$\nabla \cdot \mathbf{J} = 0.$ (11)


The complex representation of the vector A has not
changed the actual contents of the theory in any way,
for by choosing A complex, we have been able to derive
quickly the same conservation equation that is given
by a combination of the six equations that result from
substituting the real vector A, Eq. (8), in the wave
equation.

If the complex vector A is reduced to one of its
components, which we shall call u, the other two components
vanishing, the current density becomes


$\mathbf{J} = (ik/16\pi\mu)(u \nabla u^* - u^* \nabla u).$ (12)


The velocity field, which is defined as $\mathbf{v} = \mathbf{J}/u^*u$ and
gives the direction of the rays, can then be written as
the gradient of a scalar function,


$\mathbf{v} = \mathbf{J}/u^*u = (ik/16\pi\mu) \nabla \log(u^*/u).$ (13)


If we should write u in the form $u = Re^{ik\phi}$, we get
exactly the velocity vector given by Eq. (4), except
for a constant coefficient. This confirms the fact that





NEW TRANSITION TO RAY OPTICS 


the complex representation of the vector potential may 
be used to attain the same results as are usually secured 
from the real representation. In the scalar theory there
is no reason to favor one over the other, but in the vector
wave theory, there is a decided advantage in using the 
complex representation because of the simplification of 
the calculations. 
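
The scalar statement, that the current built from a complex wave $u = Re^{ik\phi}$ gives a velocity proportional to grad $\phi$, can be checked numerically in one dimension. The sketch below is illustrative only: the amplitude and eiconal functions, the values of k and $\mu$, and all function names are assumptions chosen for the example.

```python
import cmath
import math

K_WAVE = 5.0  # wave number k (illustrative value)
MU = 1.0      # magnetic permeability (illustrative value)


def amplitude(x):
    """Hypothetical real amplitude R(x)."""
    return 1.0 + 0.3 * math.sin(x)


def eiconal(x):
    """Hypothetical real eiconal phi(x)."""
    return x + 0.1 * x ** 2


def eiconal_slope(x):
    """Exact derivative of the hypothetical eiconal."""
    return 1.0 + 0.2 * x


def u(x):
    """Complex scalar wave u = R exp(i k phi), time factor omitted."""
    return amplitude(x) * cmath.exp(1j * K_WAVE * eiconal(x))


def velocity(x, h=1e-5):
    """v = J / (u* u) with J = (ik/16 pi mu)(u du*/dx - u* du/dx)."""
    du = (u(x + h) - u(x - h)) / (2 * h)  # central difference
    prefactor = 1j * K_WAVE / (16 * math.pi * MU)
    current = prefactor * (u(x) * du.conjugate() - u(x).conjugate() * du)
    return (current / abs(u(x)) ** 2).real


def expected_velocity(x):
    """Closed form (k^2 / 8 pi mu) d(phi)/dx obtained from log(u*/u) = -2ik phi."""
    return K_WAVE ** 2 / (8 * math.pi * MU) * eiconal_slope(x)


for x0 in (0.0, 0.5, 1.3):
    print(x0, velocity(x0), expected_velocity(x0))
```

The amplitude R(x) drops out of the velocity entirely, which is the numerical counterpart of writing v as the gradient of $\log(u^*/u)$.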

The fact that the velocity field may be written as 
the gradient of a scalar function is of course not true 
in general, because the vector character of A will 
preclude the possibility of writing the velocity as a 
curl-free field. We can, however, employ a definition of 
the velocity similar to the one given in the scalar case, 


$\mathbf{v} = \mathbf{J}/(\mathbf{A}^* \cdot \mathbf{A}),$ (14)


and then show that it can be rewritten as the gradient
of a scalar, plus additional terms that represent the
vortex motion of the stream lines of the ray fluid*:


$\mathbf{v} = (ik/16\pi\mu)(\nabla\phi_1 + \xi_2 \nabla\phi_2 + \xi_3 \nabla\phi_3),$ (15)


where 
p&e = A aA ztA A oo. A ‘A 2) 


$\rho\xi_3 = A_x^* A_x - A_y^* A_y + A_z^* A_z,$

$\rho = A_x^* A_x + A_y^* A_y + A_z^* A_z,$
—4i(kgi—wt) =log(A,*A,*/A,A,), 
—4ikgo=log(A.*A,/A2A,*), 
—4iky;=log(A,*A,/A-zA,*). 


The four functions $\xi_2$, $\xi_3$, $\phi_2$, and $\phi_3$ may be called
the Clebsch parameters in analogy to the parameters
that were introduced by Clebsch in the middle of the
past century for the decomposition of the velocity
field of a real fluid. Clebsch showed that the fluid
velocity may be written


$\mathbf{v}' = \nabla\phi + \xi \nabla\eta.$

The functions $\xi$ and $\eta$ have since been called the Clebsch


*The particular way we have chosen to represent the velocity
field, as shown below in Eq. (14), is of course not unique. Actually,
there are an infinite number of ways in which the decomposition
of the velocity field can be made. Each decomposition can be
transformed into any other by means of a group of transformations
which leave the velocity vector invariant. This group of transformations
is known as the Clebsch transformations. If the
general velocity field is given by $\mathbf{v} = \nabla\phi_1 + \xi_2 \nabla\phi_2 + \xi_3 \nabla\phi_3$, then the
infinitesimal Clebsch transformations are given by


$\delta\phi_1 = F - \xi_2\, \partial F/\partial\xi_2 - \xi_3\, \partial F/\partial\xi_3, \quad \delta\phi_2 = \partial F/\partial\xi_2, \quad \delta\phi_3 = \partial F/\partial\xi_3,$
$\delta\xi_2 = -\partial F/\partial\phi_2, \quad \delta\xi_3 = -\partial F/\partial\phi_3,$


where F, the generator of the transformation, is an arbitrary
function of $\xi_2$, $\xi_3$, $\phi_2$, $\phi_3$. A fact that is somewhat troublesome is
that we have represented the velocity field by five variables,
whereas three should be sufficient. It should be possible to transform
Eq. (14) to the so-called canonical form, so that the vector v
may be written as a function of three variables, $\mathbf{v} = \nabla S + \xi \nabla p$.
However, the decomposition of the velocity field as shown in
Eq. (14) may have some advantages over the canonical form in
that the variables that appear in (14) may have a simple physical
interpretation. J. Tiomno has given an interesting interpretation
of the variables that appear in the complex vector representing
spin-one particles.

4 Quoted in H. Lamb, Hydrodynamics (Dover Publications, Inc.,
New York, 1945), sixth edition, p. 248.
parameters. In the case of real fluids these parameters
offer a simple interpretation, for the vortex lines lie
along the intersection of the two surfaces $\xi$ = const and
$\eta$ = const, as is easily seen from the equation for the
vorticity $\zeta'$,

$\zeta' = \nabla \times \mathbf{v}' = \nabla\xi \times \nabla\eta.$
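
The Clebsch identity quoted above, that the curl of $\nabla\phi + \xi\nabla\eta$ equals $\nabla\xi \times \nabla\eta$, can be verified numerically with finite differences. The sketch below uses three arbitrary smooth test fields; these fields, the sample point, and the helper names are assumptions made purely for illustration.

```python
import math

H = 1e-3  # finite-difference step


def grad(f, p):
    """Central-difference gradient of a scalar function f at point p."""
    x, y, z = p
    return (
        (f((x + H, y, z)) - f((x - H, y, z))) / (2 * H),
        (f((x, y + H, z)) - f((x, y - H, z))) / (2 * H),
        (f((x, y, z + H)) - f((x, y, z - H))) / (2 * H),
    )


def cross(a, b):
    """Vector product a x b."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


# Hypothetical smooth test fields (not taken from the paper).
def phi(p):
    return math.sin(p[0]) * p[1] + p[2] ** 2


def xi(p):
    return p[0] * p[1] + math.cos(p[2])


def eta(p):
    return p[0] ** 2 - p[1] * p[2] + math.sin(p[1])


def velocity(p):
    """Clebsch form v' = grad(phi) + xi * grad(eta)."""
    g_phi, g_eta = grad(phi, p), grad(eta, p)
    return tuple(g_phi[i] + xi(p) * g_eta[i] for i in range(3))


def curl(v, p):
    """Central-difference curl of a vector field v at point p."""
    dvx = grad(lambda q: v(q)[0], p)
    dvy = grad(lambda q: v(q)[1], p)
    dvz = grad(lambda q: v(q)[2], p)
    return (dvz[1] - dvy[2], dvx[2] - dvz[0], dvy[0] - dvx[1])


P0 = (0.4, -0.7, 1.1)  # arbitrary sample point
lhs = curl(velocity, P0)
rhs = cross(grad(xi, P0), grad(eta, P0))
print(lhs)
print(rhs)
```

Since the vorticity is perpendicular to both $\nabla\xi$ and $\nabla\eta$, it is tangent to both surfaces $\xi$ = const and $\eta$ = const, which is the geometric statement made in the text.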


This simple interpretation fails in our representation
of the electromagnetic fluid, for the vorticity takes the
form

$\zeta = \operatorname{curl} \mathbf{v} = \nabla\xi_2 \times \nabla\phi_2 + \nabla\xi_3 \times \nabla\phi_3,$ (17)


and this vector lies neither in the intersection formed
by the surfaces $\xi_2$ = const, $\phi_2$ = const, nor in the intersection
of $\xi_3$ = const, $\phi_3$ = const, but rather in a direction
given by the vector sum of the lines formed in the
crossing of the two sets of surfaces.

In many cases it is possible to show that the Clebsch 
parameters vanish and that the velocity remains the 
gradient of a scalar function. A simple example is 
elliptically polarized light in free space. For a wave 
traveling in the direction of the z-axis, the components 
of the vector potential are 


$A_x = C e^{i(kz - \omega t)},$
$A_y = D e^{i(kz - \omega t + \delta)},$
$A_z = 0,$ (18)


where C, D, and $\delta$ are constants. The velocity vector is
equal to

$\mathbf{v} = (ik\omega/32\pi\mu)\, \nabla \log(A_x^* A_y^* / A_x A_y),$
$v_x = 0, \quad v_y = 0, \quad v_z = k^2\omega/8\pi\mu,$ (19)


which proves that for elliptically polarized light the
rays are normal to a single wave surface. This result
may fail when C, D, and $\delta$ become functions of the
coordinates.⁵

In general, vortex motion will be present in the
electromagnetic fluid since the terms containing the
Clebsch parameters in the velocity field do not always
vanish. For reasons that will become apparent in
Sec. IV, we shall use the term body vorticity to describe
these vortices arising from the Clebsch parameters
in the velocity field.

We can give a simple example to illustrate the
presence of body vorticity in the ray fluid.

In a perfectly conducting rectangular wave guide
whose width in the x-direction is a, and in the y-direction
b, there exist two independent modes of transmission
for an electromagnetic field propagating in the
z-direction, the transverse electric and the transverse
magnetic modes. For convenience we shall limit our
discussion to the transverse electric mode, with the
understanding that analogous statements hold for the
transverse magnetic mode.


5 The failure of circularly or elliptically polarized light of
infinite extension to exhibit vorticity is related to the fact that
the same radiation does not possess angular momentum in the
direction of propagation. However, a finite beam of light that is
circularly polarized possesses both angular momentum and
vorticity in the direction of motion of the beam. For a beam of
light in free space, the main contribution to both the vorticity
and the angular momentum comes from the edge of the beam since
it is here that abrupt changes take place in the field which is
relatively constant inside the beam. For certain types of elliptically
polarized beams that are transmitted in a wave guide, a
similar contribution to the angular momentum and the vorticity
comes from the surface of the guide.

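For the transverse electric mode the components of
the vector potential, A, are


$A_x = k_2 (\omega/c) \cos k_1 x\, \sin k_2 y\; e^{i(k_3 z - \omega t)},$
$A_y = -k_1 (\omega/c) \sin k_1 x\, \cos k_2 y\; e^{i(k_3 z - \omega t)},$
$A_z = 0,$ (20)

$k_1 = \pi m/a, \quad k_2 = \pi n/b, \quad k_1^2 + k_2^2 + k_3^2 = \omega^2/c^2,$
$m, n = 0, 1, 2, \cdots, \quad m + n \geq 1.$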


For integral values of m and n these solutions satisfy
Eqs. (1) and (2). They also satisfy the proper boundary
conditions at the conducting surface, for A is always
normal to the surface on the surface. The amplitudes of
the two components, $A_x$ and $A_y$, are functions of the
coordinates x and y, and one may consider this wave as
a linearly polarized vibration whose shape has been
distorted by the introduction of the boundary.
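
The claims made for the transverse electric mode, namely that it satisfies the gauge condition, the wave equation, and the boundary conditions, can be checked numerically. The following is a minimal sketch: the guide dimensions, mode indices, and value of $\omega/c$ are illustrative assumptions, the common time factor $e^{-i\omega t}$ is dropped, and $\epsilon\mu = 1$ is assumed inside the guide, as implied by the relation $k_1^2 + k_2^2 + k_3^2 = \omega^2/c^2$.

```python
import cmath
import math

# Illustrative guide dimensions, mode indices and omega/c (assumed values).
A_WIDTH, B_WIDTH = 2.0, 1.0
M, N = 1, 1
W_OVER_C = 4.0

K1 = math.pi * M / A_WIDTH
K2 = math.pi * N / B_WIDTH
K3 = math.sqrt(W_OVER_C ** 2 - K1 ** 2 - K2 ** 2)  # k1^2 + k2^2 + k3^2 = w^2/c^2


def vector_potential(p):
    """Spatial part of the transverse electric mode at point p = (x, y, z)."""
    x, y, z = p
    phase = cmath.exp(1j * K3 * z)
    ax = K2 * W_OVER_C * math.cos(K1 * x) * math.sin(K2 * y) * phase
    ay = -K1 * W_OVER_C * math.sin(K1 * x) * math.cos(K2 * y) * phase
    return (ax, ay, 0j)


def shifted(p, axis, h):
    """Return p displaced by h along the given axis."""
    q = list(p)
    q[axis] += h
    return tuple(q)


def first_derivative(i, axis, p, h=1e-4):
    """Central difference of component i along the given axis."""
    return (vector_potential(shifted(p, axis, h))[i]
            - vector_potential(shifted(p, axis, -h))[i]) / (2 * h)


def laplacian(i, p, h=1e-4):
    """Second-order central-difference Laplacian of component i."""
    center = vector_potential(p)[i]
    total = 0j
    for axis in range(3):
        total += (vector_potential(shifted(p, axis, h))[i]
                  - 2 * center
                  + vector_potential(shifted(p, axis, -h))[i]) / h ** 2
    return total


def divergence(p):
    """div A, which the gauge condition requires to vanish."""
    return sum(first_derivative(i, i, p) for i in range(3))


def helmholtz_residual(i, p):
    """Residual of (nabla^2 + w^2/c^2) A_i = 0, assuming eps * mu = 1."""
    return laplacian(i, p) + W_OVER_C ** 2 * vector_potential(p)[i]


P0 = (0.37, 0.21, 0.9)  # arbitrary interior point
print(abs(divergence(P0)))
print([abs(helmholtz_residual(i, P0)) for i in range(3)])
```

The boundary condition can be read off in the same way: $A_y$ vanishes on the walls x = 0 and x = a, and $A_x$ vanishes on y = 0 and y = b, so that only the component normal to each wall survives there.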

The electromagnetic fluid within the wave guide 
exhibits no vorticity, and indeed, the velocity vector 
for any such wave has exactly the same form as the 
velocity vector of elliptically polarized light in empty 
space ; the rays are normal to the surface z= constant. 

If we should now simultaneously excite another wave
having the same wavelength, but which is ninety
degrees out of phase with the original excitation, and
with differing wave numbers, $k_1'$ and $k_2'$, then vortices
lying in the x-y plane will appear in the electromagnetic
fluid. For the appearance of vortices it is necessary
that $k_1'$ differ from $k_1$, or $k_2'$ from $k_2$, and that all
should differ from zero. This combined wave may be
considered as an inhomogeneous, elliptically polarized
vibration in the wave guide.

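When both waves are excited, the solution for the
transverse electric mode becomes


$A_x = (\omega/c)(k_2 \cos k_1 x\, \sin k_2 y + i k_2' \cos k_1' x\, \sin k_2' y)\, e^{i(k_3 z - \omega t)},$
$A_y = (\omega/c)(-k_1 \sin k_1 x\, \cos k_2 y - i k_1' \sin k_1' x\, \cos k_2' y)\, e^{i(k_3 z - \omega t)},$
$A_z = 0,$ (21)

$k_1 = \pi m/a, \quad k_2 = \pi n/b, \quad k_1' = \pi m'/a, \quad k_2' = \pi n'/b,$
$m, n, m', n' = 0, 1, 2, \cdots,$
$k_1^2 + k_2^2 + k_3^2 = k_1'^2 + k_2'^2 + k_3^2 = \omega^2/c^2.$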




The general form for the components of A is now

$A_x = R_1 e^{i\chi_1},$
$A_y = R_2 e^{i\chi_2},$
$A_z = 0,$


where $R_1$ and $R_2$, and $\chi_1$ and $\chi_2$, are real but unequal
functions of the coordinates. Substitution of the values
of $A_x$ and $A_y$ in the velocity vector, Eq. (14), shows
that the terms containing the Clebsch parameters do
not all vanish nor do they reduce to the gradient of a
scalar function. Thus the curl of the velocity vector
has nonvanishing components.
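
The appearance of vorticity when a second mode is added ninety degrees out of phase can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than a calculation from the paper: it takes a square guide of side $\pi$, so that $k_1 = m$ and $k_2 = n$; it picks the modes (m, n) = (1, 2) and (m', n') = (2, 1), which share the same $k_3$; it drops the common factors $(\omega/c)\,e^{i(k_3 z - \omega t)}$; and it measures the velocity in units of $k/8\pi\mu$. All names are hypothetical.

```python
import math


def te_mode(k1, k2, x, y):
    """Transverse amplitudes of one TE mode and their (x, y) derivatives."""
    ax = k2 * math.cos(k1 * x) * math.sin(k2 * y)
    ay = -k1 * math.sin(k1 * x) * math.cos(k2 * y)
    d_ax = (-k1 * k2 * math.sin(k1 * x) * math.sin(k2 * y),
            k2 * k2 * math.cos(k1 * x) * math.cos(k2 * y))
    d_ay = (-k1 * k1 * math.cos(k1 * x) * math.cos(k2 * y),
            k1 * k2 * math.sin(k1 * x) * math.sin(k2 * y))
    return ax, ay, d_ax, d_ay


def transverse_velocity(x, y, modes):
    """(v_x, v_y) = sum_c Im(A_c* grad A_c) / (A* . A), in units of k/(8 pi mu).

    modes is a list of (weight, k1, k2); a weight of 1j puts a mode
    ninety degrees out of phase with a mode of weight 1.
    """
    ax = ay = 0j
    d_ax = [0j, 0j]
    d_ay = [0j, 0j]
    for weight, k1, k2 in modes:
        fx, fy, dfx, dfy = te_mode(k1, k2, x, y)
        ax += weight * fx
        ay += weight * fy
        for i in range(2):
            d_ax[i] += weight * dfx[i]
            d_ay[i] += weight * dfy[i]
    rho = abs(ax) ** 2 + abs(ay) ** 2
    return tuple(
        (ax.conjugate() * d_ax[i] + ay.conjugate() * d_ay[i]).imag / rho
        for i in range(2)
    )


def curl_z(x, y, modes, h=1e-5):
    """z-component of curl v by central differences."""
    dvy_dx = (transverse_velocity(x + h, y, modes)[1]
              - transverse_velocity(x - h, y, modes)[1]) / (2 * h)
    dvx_dy = (transverse_velocity(x, y + h, modes)[0]
              - transverse_velocity(x, y - h, modes)[0]) / (2 * h)
    return dvy_dx - dvx_dy


SINGLE = [(1, 1.0, 2.0)]                   # one mode: (m, n) = (1, 2)
DOUBLE = [(1, 1.0, 2.0), (1j, 2.0, 1.0)]   # plus (m', n') = (2, 1), shifted by 90 degrees

print(curl_z(0.2, 0.2, SINGLE))
print(curl_z(0.2, 0.2, DOUBLE))
```

For the single mode both transverse amplitudes are real up to the common phase, so the transverse velocity and its curl vanish. For the superposition, the ratio of the two amplitudes and the difference of their phases both vary with position, and the curl no longer vanishes. In this example $v_z$ is constant, so the vorticity vector points along z and the circulating motion takes place in the x-y plane.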

It now becomes clear under what conditions body 
vorticity may arise in the wave guide. The wave must 
be bounded so that its amplitude will be distorted from 
the constant form it assumes in free space, and further- 
more, distorted in such a fashion that the amplitudes 
of at least two of the nonvanishing components of the 
vector potential differ from one another by an amount 
that varies from point to point in space. 

However, this distortion of the wave does not suffice 
for the appearance of vortices. Vorticity will only appear 
when there exists, simultaneously with distortion, a dif- 
ference in phase between the two nonvanishing com- 
ponents of the vector potential. This phase difference 
must also be a function of position. The excitation of 
the two waves ninety degrees out of phase with each 
other leads to the introduction of such a phase dif- 
ference. 

In order to visualize better the difference between a 
wave field without body vorticity and one with body 
vorticity, we may introduce the concept of circulation 
of the fluid. The circulation is defined as the line integral 
of the velocity vector taken around the closed curve S, 


$C = \oint_S \mathbf{v} \cdot d\mathbf{r}.$


In the curl-free case the circulation will vanish, because
every contribution to the integral from an infinitesimal
segment of the curve S is canceled by an equal and
opposite contribution from some other portion of the
curve. This pair-wise cancellation no longer occurs
when the phases of the $A_x$ and $A_y$ components of the
vector potential differ by a function of position. In our
example we need only define the circulation in the plane
perpendicular to the z-axis because it is easily seen
that any portion of the path in the direction of the
z-axis will make no contribution to the vorticity.

The examples that we may choose to illustrate the 
case of a velocity field with body vorticity are numerous. 
We have given this particular example because of its 
extreme simplicity. 

However, the rays that we have defined by means of 
the velocity field, Eq. (14), are in general not the same 
rays along which the energy of the electromagnetic 
field is transported. Energy is transported along the 
lines of flow defined by Poynting’s vector and, as we
shall see in Sec. IV, this vector differs from Eq. (14) by 
a term that is proportional to the curl of a vector. 
Hence the criteria that we have developed in the present 
section for the existence of vortices in the ray fluid are 
not complete, for if we are interested in the lines of 
flow associated with energy transport, we have to 
consider the vortex motion which is due to the curl 
term in Poynting’s vector. If we keep in mind that this 
term may contribute to the vorticity of the ray fluid 
even when the vortices defined by the Clebsch param- 
eters do not appear, then we may accept the conditions 
that we have presented in this section as necessary for 
the excitation of those vortices associated with the 
Clebsch parameters. To distinguish the vorticity arising 
from the curl term in Poynting’s vector from the body 
vorticity defined by the Clebsch parameters, we shall 
name the former intrinsic vorticity. An explanation for 
this distinction will appear in Sec. IV. 


III. THE GENERALIZED EICONAL AND THE
EQUATIONS FOR THE CLEBSCH
PARAMETERS


We shall now deal with the problem of finding the 
six equations that are to replace the vector wave 
equation. The appearance of six equations, in place of 
the three original wave equations for the components 
of the vector potential, is due to our choice of solutions 
of the wave equation. We have already seen how in
the scalar theory the requirement that the scalar solution,
$u = R \cos(k\phi - \omega t)$, be valid for all times, led to
the replacement of the wave equation by the two equations,
the eiconal equation and the conservation law for
rays. In a similar manner, if we assume that the components
of the vector potential take the form,


$\mathbf{A} = \begin{pmatrix} R_1 \cos(k\phi_1 - \omega t) \\ R_2 \cos(k\phi_2 - \omega t) \\ R_3 \cos(k\phi_3 - \omega t) \end{pmatrix},$ (8)


and that these solutions are valid for all time, there
will be a doubling of the number of equations for each
component of the vector equation, so that one finally
has six equations as equivalent to the original three.


One might equally well assume a complex space-time
dependence for each component of the vector potential,
$Re^{i(k\phi - \omega t)}$, with real amplitudes, R, and phase factors, $\phi$,
and then require that the complex vector potential
satisfy the wave equation. There will be a similar
doubling of the number of equations because the real
and imaginary parts of the wave equation must be
satisfied separately.

In the scalar wave theory the substitution of the
scalar wave function u,


$u = R \cos(k\phi - \omega t),$ (3)


into the wave equation, led immediately to the con- 
servation law of ray density and to the eiconal equation. 
However, it is not at all convenient to use the same 
method of simple substitution in the vector case.
Straightforward calculation shows that the one equation 
that we have already obtained by other means, the 
conservation law for the rays, Eq. (11), is not at all 
a simple consequence of placing A in the form (8) in 
the wave equation. 

In place of the method of simple substitution we shall 
employ instead a variational principle in order to 
derive the six equations that we are seeking. A similar 
variational principle was used by Green and Wolf⁶ as 
an alternative derivation of the eiconal equation and 
the conservation law for rays in the special case of 
polarized light. We shall extend this variational prin- 
ciple to the case of the most general velocity field of 
the electromagnetic fluid, and derive the conservation 
law for the rays, the generalized eiconal, and four other 
equations which describe the vortices in the medium. 
All these equations are nonlinear and give the usual 
wave theory the form of a complex fluid mechanics. We 
repeat once again that we intend to derive six equations 
of the electromagnetic fluid that will replace the vector 
wave equation. The fact that we have six equations is 
due to the particular form that we have assumed for 
the wave vector A in Eq. (8). We are not attempting to 
rederive the wave equation or Maxwell’s equations in 
their usual form from the well-known variational prin- 
ciple. We assume throughout the discussion that the 
vector A is chosen so that it satisfies the gauge con- 
dition. 

The variational principle of Green and Wolf employs 
a complex representation for the vector potential A. 
We shall write A in the form 


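$$\mathbf{A} = \begin{pmatrix} R_1\,e^{i[k(\varphi_1+\varphi_2+\varphi_3)-\omega t]} \\ R_2\,e^{i[k(\varphi_1+\varphi_2-\varphi_3)-\omega t]} \\ R_3\,e^{i[k(\varphi_1-\varphi_2+\varphi_3)-\omega t]} \end{pmatrix}, \tag{22}$$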


where for convenience we have chosen the phase angles 
to be a particular linear combination of our old phase 
angles given in (8). The choice of this particular linear 
combination of phases is not significant for what 
follows. A large number of other choices would have 
served equally well, except that they would have led 
to a different definition of the Clebsch parameters that 
appear in the velocity field, Eq. (14). All of the various 
representations of the vector A may be transformed 
into one another by Clebsch transformations, so that 
our special choice of the phase angles in (22) represents 
no real restriction on the theory. 
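
A small numerical check may help to show why a linear combination of phases of this kind leads to a velocity field of Clebsch form. The sketch below is illustrative only and rests on assumptions that are not stated in this section: the three components are taken to carry the phases $\varphi_1+\varphi_2+\varphi_3$, $\varphi_1+\varphi_2-\varphi_3$, and $\varphi_1-\varphi_2+\varphi_3$, the ray density is taken to be $\rho = R_1^2+R_2^2+R_3^2$, and the Clebsch parameters are taken to be $\xi_2 = (R_1^2+R_2^2-R_3^2)/\rho$ and $\xi_3 = (R_1^2-R_2^2+R_3^2)/\rho$. The function names are hypothetical.

```python
import random


def dot(a, b):
    """Euclidean dot product of two 3-vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))


def clebsch_forms(R, grads):
    """Compare amplitude-weighted phase gradients with the Clebsch form.

    R     -- amplitudes (R1, R2, R3) of the three components
    grads -- gradients (g1, g2, g3) of phi_1, phi_2, phi_3 at one point
    """
    w = [r * r for r in R]                  # weights R_i^2
    rho = sum(w)                            # ASSUMED ray density
    xi2 = (w[0] + w[1] - w[2]) / rho        # ASSUMED Clebsch parameter xi_2
    xi3 = (w[0] - w[1] + w[2]) / rho        # ASSUMED Clebsch parameter xi_3
    g1, g2, g3 = grads

    # ASSUMED sign pattern of the phases of the three components.
    signs = [(1, 1, 1), (1, 1, -1), (1, -1, 1)]
    theta = [[s[0] * a + s[1] * b + s[2] * c for a, b, c in zip(g1, g2, g3)]
             for s in signs]

    # Sum_i R_i^2 grad(theta_i), and the Clebsch form rho (g1 + xi2 g2 + xi3 g3).
    flux = [sum(w[i] * theta[i][j] for i in range(3)) for j in range(3)]
    clebsch = [rho * (g1[j] + xi2 * g2[j] + xi3 * g3[j]) for j in range(3)]

    # Sum_i R_i^2 |grad(theta_i)|^2 / rho, and the phase terms of the eiconal.
    quad = sum(w[i] * dot(theta[i], theta[i]) for i in range(3)) / rho
    eiconal = (dot(g1, g1) + dot(g2, g2) + dot(g3, g3)
               + 2 * xi2 * dot(g1, g2) + 2 * xi3 * dot(g1, g3)
               + 2 * (xi2 + xi3 - 1) * dot(g2, g3))
    return flux, clebsch, quad, eiconal


if __name__ == "__main__":
    rng = random.Random(0)
    R = [rng.uniform(0.1, 2.0) for _ in range(3)]
    grads = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
    flux, clebsch, quad, eiconal = clebsch_forms(R, grads)
    print("max flux difference:", max(abs(a - b) for a, b in zip(flux, clebsch)))
    print("quadratic form difference:", abs(quad - eiconal))
```

With these assumed definitions both differences vanish to rounding error for arbitrary amplitudes and gradients, which is the algebraic content of the statement that the choice of phases fixes the definition of the Clebsch parameters.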

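As the Lagrangian density for our variational principle
we choose the time average of the usual Lagrangian
density of the electromagnetic field,

$$\bar{L} = \frac{1}{16\pi}\left(\epsilon\,\mathbf{E}^*\cdot\mathbf{E} - \mu\,\mathbf{H}^*\cdot\mathbf{H}\right). \tag{23}$$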


6 The variational principle of Green and Wolf is similar to one
devised by H. Bateman for dealing with the equations of fluid
mechanics. See H. Bateman, Partial Differential Equations of
Mathematical Physics (MacMillan Company, New York, 1932),
p. 164.
When we replace the field vectors by the complex 
vector potential A and make use of the gauge condition, 
our Lagrangian density becomes 


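$$\bar{L} = \frac{1}{16\pi\mu}\left(\epsilon\mu k^2\,\mathbf{A}^*\cdot\mathbf{A} - \frac{\partial\mathbf{A}^*}{\partial x_i}\cdot\frac{\partial\mathbf{A}}{\partial x_i}\right). \tag{24}$$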


We now express the Lagrangian above as a function of
the six new variables, $\rho$ and $\varphi_1$, and the four Clebsch
parameters $\varphi_2$, $\varphi_3$, $\xi_2$, and $\xi_3$. If the vector A is given by
Eq. (22), the Lagrangian becomes
L | = VBs'VBs | 1 (BsVEo+a2VEs)? 
=——} ¢€—-— i = 
16rt ul k%p? = 4RasBs 8k? Bs (East Esa) 


+(V¢1)*+ (V¢2)?+ (Vos)?+2V gi: Vote 





+2¥0r-Verts+2Ver-Vex(t—89 |, (25) 


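where the $\alpha$’s and $\beta$’s are the following functions of the
Clebsch parameters:

$$\alpha_2 = 1 - \xi_2, \qquad \beta_2 = 1 + \xi_2, \qquad \alpha_3 = 1 - \xi_3, \qquad \beta_3 = 1 + \xi_3.$$
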

In the short-wavelength limit, as k approaches infinity, 
the new Lagrangian is 


(VBs)° 1 (6;VE2+a2VEs)? 
=—_{e--—- >— 
167 pw! 4R?a383 8k? Bs(t203+ E302) 
+ (V¢e1)?+(V¢2)?+ (V¢3)?+2V gi: Vepoke 


. a=1—&, 


a= 1—&s, 





4 


pk? | 1 


$28.0 or-Vort2V 9 Vet) |} (26) 


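The variation of the phase angle $\varphi_1$ leads to Eq. (11),
which we had previously interpreted as a conservation
law,

$$\frac{k^2}{8\pi\mu}\,\nabla\cdot\left[\rho\left(\nabla\varphi_1 + \xi_2\nabla\varphi_2 + \xi_3\nabla\varphi_3\right)\right] = 0, \tag{27}$$

and which we had derived directly from the wave
equation. $\rho$ is considered as the ray density and the
velocity of the ray fluid is given by

$$\mathbf{v} = \frac{k^2}{8\pi\mu}\left(\nabla\varphi_1 + \xi_2\nabla\varphi_2 + \xi_3\nabla\varphi_3\right).$$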


The variation of the density leads to the generalized
eiconal:


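$$\begin{aligned}
\epsilon\mu ={}& (\nabla\varphi_1)^2 + (\nabla\varphi_2)^2 + (\nabla\varphi_3)^2 + 2\nabla\varphi_1\cdot\nabla\varphi_2\,\xi_2 \\
&+ 2\nabla\varphi_1\cdot\nabla\varphi_3\,\xi_3 + 2\nabla\varphi_2\cdot\nabla\varphi_3\,(\xi_2+\xi_3-1) \\
&+ \frac{1}{8k^2}\,\frac{\left[\nabla\xi_2(1+\xi_3) + \nabla\xi_3(1-\xi_2)\right]^2}{(1+\xi_3)\left[\xi_2(1-\xi_3) + \xi_3(1-\xi_2)\right]} \\
&+ \frac{1}{4k^2}\,\frac{\nabla\xi_3\cdot\nabla\xi_3}{(1-\xi_3^2)}.
\end{aligned} \tag{28}$$
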
We call Eq. (28) the generalized eiconal because if two
of the components of the vector A vanish, i.e., $R_1 = 0$,
$R_2 = 0$, and $\varphi_2 = \varphi_3$, we obtain the original eiconal,
Eq. (6),

$$(\nabla\varphi_1)^2 = \epsilon\mu.$$


The equations for the four other parameters, $\xi_2$, $\xi_3$, $\varphi_2$,
and $\varphi_3$, describe the change in vorticity as the fluid
moves along the stream lines. For example, the variation
of $\varphi_2$ leads to an equation that can be given the following
form when use is made of the conservation law (27):

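$$\nabla\xi_2\cdot\mathbf{v} = \frac{k^2}{8\pi\mu}\,\frac{1}{\rho}\,\nabla\cdot\left[\rho\nabla\varphi_2(\xi_2^2 - 1) + \rho\nabla\varphi_3(\xi_3\xi_2 - \xi_2 - \xi_3 - 1)\right]. \tag{29}$$

This equation shows the rate of change of $\xi_2$ along the
stream lines. A similar equation holds for $\xi_3$, and may
be obtained by interchanging $\xi_2$ and $\varphi_2$ with $\xi_3$ and $\varphi_3$
in (29):

$$\nabla\xi_3\cdot\mathbf{v} = \frac{k^2}{8\pi\mu}\,\frac{1}{\rho}\,\nabla\cdot\left[\rho\nabla\varphi_3(\xi_3^2 - 1) + \rho\nabla\varphi_2(\xi_2\xi_3 - \xi_2 - \xi_3 - 1)\right].$$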


The rates of change of $\varphi_2$ and $\varphi_3$ along the stream
lines are similarly given by varying the Lagrangian
with respect to $\xi_2$ and $\xi_3$:


tse k? |-2. (83Vée+a2VEs)? 
pl6mu! 8k? Bs?(Eoas+ Esa) 2 
p Vés- (VE28st+VEsae 
4k? Ba(Eaca-+ Esa) 
1 pB3VEo+parVEs 
4k? ( feos} Et ) 


(30) 








4k? 
—2pV v2: (Vgei— £2V G2) — 2aspV go: Vos ,» (31) 


k? p Vés p (Vé&s)*é3 
Vo3:V= | -*v. —)—— 
pl6mul 2k? a3B3/ 2k? as? 


p Vé2- (BsVéeta2VEs) 
"4k Ba( Eos Esa) 
p (B3VEst+a2VEs)*(2Est+or2— 4é2ks) 
8k Ba? (Exar Eons)? 
wou ee) 
4k? 1 Bs East Esore 








—2pV os: (Vei-— Ex V go + V y2— EsV ys) . (32) 


Equations (27) through (32) are six equations for
the six new variables that we have introduced to describe
the electromagnetic field. Only four of these
variables are independent since the vector A must 
satisfy the gauge condition. Thus we have really eight 
equations that, in the short-wavelength limit, have 
replaced the vector wave equation and the gauge con- 
dition. If a more exact theory is demanded, then one 
may use the true Lagrangian, L, given by Eq. (25), 
and derive six equations that replace the vector wave 
equation without any approximation. 


IV. POYNTING’S VECTOR 


In this section, we shall show that the definition of 
the current density, Eq. (9), 


$$\mathbf{J} = (i\omega/16\pi\mu)\left[(\nabla\mathbf{A}^*)\cdot\mathbf{A} - (\nabla\mathbf{A})\cdot\mathbf{A}^*\right], \tag{9}$$


which is consistent with the wave equation for the 
vector potential A, is proportional to the time average 
of Poynting’s vector, except for a term which is the 
curl of a vector. 

The time average of Poynting’s vector, for complex
vectors E and H which have the time dependence
$e^{-i\omega t}$, is

$$\langle\mathbf{S}\rangle = (c/16\pi)\,(\mathbf{E}^*\times\mathbf{H} + \mathbf{E}\times\mathbf{H}^*). \tag{33}$$


We assume that the electromagnetic field can be de- 
rived from a complex vector potential A, satisfying the 
gauge condition, divA=0, and having the same time 
dependence as the field vectors E and H. The equations 
relating the field variables to the vector potential are 


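$$\mathbf{H} = \operatorname{curl}\mathbf{A}/\mu, \qquad \mathbf{E} = -(1/c)\,\partial\mathbf{A}/\partial t. \tag{34}$$

In terms of the vector A, Poynting’s vector becomes

$$\langle\mathbf{S}\rangle = (i\omega/16\pi\mu)\left[(\nabla\mathbf{A}^*)\cdot\mathbf{A} - (\nabla\mathbf{A})\cdot\mathbf{A}^* + \nabla\times(\mathbf{A}^*\times\mathbf{A})\right]. \tag{35}$$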


The momentum density of the electromagnetic field is 
just S/c so that the current density, J, differs from the 
time-averaged momentum density of the field by the 
term $(ik/16\pi\mu)\,\operatorname{curl}(\mathbf{A}^*\times\mathbf{A})$.

The decomposition that we have made of the time- 
averaged Poynting’s vector can be given an interesting 
physical interpretation. To do this we must first ask 
the origin of the curl term that appears in Eq. (35) for 
Poynting’s vector. We shall show now that it is equal 
to the mixed space and time components of the non- 
symmetric part of the canonical stress tensor. 

The canonical (unsymmetrized) electromagnetic field 
stress tensor is equal to 


(gra 0) 


ee 


| (36) 


"i 5.” 
T= —| GP" Coa P"hlthuahl 


dr 


where the electromagnetic field is defined by means of
the antisymmetric tensor $\varphi_{\mu\nu}$ which is the curl of the
four-dimensional vector potential $\varphi_\mu$:


Oy, 


OPp 
(gs=A, gs=—cy). 


Pur = oT 


Ox” =x 
$\delta_\alpha^\beta$ is the four-dimensional Kronecker delta symbol, 
and we have used the usual summation convention. 

The last term in the definition of the stress tensor, 
Eq. (36), is the only part of the stress tensor that has 
antisymmetric components when the covariant tensor 
index, a, is raised to the contravariant position. It is 
customary to reject this term from the definition of the 
stress tensor so that the angular momentum of the 
field in empty space is conserved. Its rejection does not 
affect the integrated values of the field for energy and 
momentum, which remain the same as long as the 
volume of space over which the integration is to take 
place is large enough so that the field falls off rapidly 
at the boundaries. 

A short calculation shows that the time average of 
this last antisymmetric term in the stress tensor is 
identical with the curl term in the time-averaged 
Poynting’s vector, Eq. (35). We take the time average 
of the $T^{s4}$ components of the stress tensor (the component
$s$ represents the spatial part of the tensor and
the component 4 the time-like direction) and note that
in the case of a pure radiation field the gauge condition,
div A = 0, and the condition $\varphi_4 = 0$ can be realized
simultaneously. $\langle T^{s4}\rangle$ becomes


$$\langle T^{s4}\rangle = (i\omega/16\pi\mu)\left[(\nabla\mathbf{A}^*)\cdot\mathbf{A} - (\nabla\mathbf{A})\cdot\mathbf{A}^* + \nabla\times(\mathbf{A}^*\times\mathbf{A})\right] + (i\omega/16\pi\mu)\,\nabla\times(\mathbf{A}^*\times\mathbf{A}). \tag{37}$$

It is immediately seen that the terms in brackets (the
contribution from the symmetric part of the tensor)
are exactly the terms appearing in the time average of
Poynting’s vector (this is obviously true, for Poynting’s
vector is the $T^{s4}$ component of the symmetrized stress
tensor). On the other hand, the time average of the
antisymmetric term in the stress tensor [the last term
in Eq. (37)] is equal to the second term in Poynting’s
vector, the curl term of which we had spoken previously.

The identification of the curl term in Poynting’s
vector with the nonsymmetric part of the stress tensor
permits us to make an interesting physical distinction
between the curl term in Poynting’s vector and the
remainder which is given by the current density, J,
Eq. (9). This distinction is based upon an analogy with
a physical interpretation that one may give to the stress
tensor of the Dirac equation in quantum mechanics.
For the Dirac equation, just as for Maxwell’s equations,
the stress tensor, that defines the energy flow associated
with the particles, contains a symmetric and a nonsymmetric
part (actually antisymmetric in the Dirac
case). The antisymmetric term can be interpreted as
the contribution to the energy stress from the spin
motion of the Dirac fluid. The symmetric term is considered
as the energy stress due to the spatial flow of
the fluid.

In a similar way, we have been able to show that, on
the average, Poynting’s vector, which is the symmetric
part of the stress tensor, decomposes into our current
density J and an additional curl expression. Our
current density, J, can be interpreted as the part of
the energy flux that is due to the general spatial motion 
of the ray fluid, while the curl term can be considered 
as due to the “intrinsic spin” of our effective fluid, for 
we have shown it to be equal to the time average of the 
nonsymmetric part of the stress tensor. Hence it be- 
comes clear why we have used the term body vortices 
to describe the vortex motion associated with the 
Clebsch parameters which appear only in the current 
density J, and intrinsic vorticity to describe the con- 
tribution to the circulation from the curl term. For the 
energy flow of the field the separation is physically 
interesting, but of course arbitrary, since the spin 
motion is intrinsically coupled with the body motion of 
the fluid. 

Intrinsic vorticity appears in the electromagnetic 
fluid when at least two nonvanishing components of 
the vector A have amplitudes which are general func- 
tions of position, although not necessarily different 
functions. The phases of both components must differ 
by at least a constant. Intrinsic vorticity may also 
appear when the two nonvanishing components of the 
vector A exhibit a phase difference that is a function of 
position. For a cylindrical wave guide, the simultaneous 
excitement of the lowest wave members for the trans- 
verse electric and the transverse magnetic modes, 
results in a vortex flow which takes the form of circles 
normal to the axis of the guide. Both intrinsic and body 
vorticity are present. 

Of some practical interest is the vorticity and angular 
momentum of a finite beam of circularly polarized light. 
Unlike the idealized wave of infinite extension, the 
finite beam exhibits both vorticity and angular mo- 
mentum in the direction of propagation of the beam. 
Let us assume that the beam of circularly polarized 
light is fairly homogeneous except at the edges where 
changes in the field take place as the intensity drops to 
zero. Then it is clear that the appearance at the edges 
of body vorticity and an additional rotational motion 
due to the intrinsic vorticity will give rise to a non- 
vanishing circulation in the plane normal to the direc- 
tion of propagation. This is strictly an edge effect in 
which the angular momentum of the beam can be shown 
to consist of two parts. One part is independent of the 
form of the beam, directly proportional to the total 
energy of the wave and inversely proportional to the 
frequency of the wave. This part of the angular mo- 
mentum arises from the curl term in Poynting’s vector 
(the intrinsic vorticity). The other part of the angular 
momentum is due to the current density J in Poynting’s 
vector. To show this clearly, we write the total angular 
momentum of the beam as 


$$\boldsymbol{\Gamma} = \frac{1}{c}\int_V (\mathbf{r}\times\mathbf{S})\,dV,$$


where r is the radius vector originating from the axis 
of propagation of this beam and normal to that axis. 
After integrating by parts and neglecting surface terms, 
we have 


$$\boldsymbol{\Gamma} = \frac{1}{c}\int_V (\mathbf{r}\times\mathbf{J})\,dV - \frac{i}{16\pi k\mu}\int_V (\mathbf{E}^*\times\mathbf{E})\,dV.$$

The second term, $-(i/16\pi k\mu)\int(\mathbf{E}^*\times\mathbf{E})\,dV$, is proportional
to the total energy of the beam because the two
nonvanishing components of the vector potential have 
equal amplitudes, but they are ninety degrees out of 
phase with each other. It is this term which has played 
a role in both the theory and the experiments⁸ dealing 
with the angular momentum of light. However, the 
first term can also contribute to the total angular 
momentum but in this case it does so only at the edge 
of the beam, for it is only here that the current density 
J develops a component normal to the radius r and to 
the direction of propagation. The contribution to the 
angular momentum will thus be proportional to the 
total energy of the beam when the integration of the 
term that depends upon J makes a negligible contribu- 
tion to the total angular momentum. 

The question may be asked as to the physical difference
between the vector J and the Poynting vector S,
as alternative definitions of the momentum density of 
the electromagnetic field. Actually, as is well known, 
there exists no unique definition for the energy and the 
momentum of the electromagnetic field. The choice of 
Poynting’s vector as the momentum density of the 
field is arbitrary, and in principle a large number of 
other choices might serve equally well. However, 
Poynting’s vector, or a similarly conserved quantity 
with the same transformation properties, must play 
the role of the field momentum, for the vector J de- 
scribes not the momentum of the field, but rather the 
stream of particles (we can call them photons) in the 
electromagnetic fluid. This distinction between the mo- 
mentum of the field and the current flow is of course 
known in other field theories such as Dirac’s electron 
theory or the vector meson theory. Similarly in electro- 
magnetic theory, because of the choice of gauge and 
the time-averaging, we can assume that electromagnetic 
particles (photons) move along the trajectories given 
by J, while the flow of energy of the field, as given by 
Poynting’s vector, moves along a similar path except 
that the energy flow contains an additional swirling 
movement. 

Finally, I should like to express my appreciation to 
Professor Peter G. Bergmann of Syracuse University 
who pointed out to me that the usual methods for 
passing from wave to ray optics failed when one had to 
consider a general vector solution of the wave equation, 
and to Professor David Bohm of the Universidade de 
Sao Paulo for stimulating discussion and helpful com- 
ments on a number of important questions. 

8 R. A. Beth, Phys. Rev. 50, 115 (1936); W. Heitler, The
Quantum Theory of Radiation (Oxford University Press, London,
1944), first edition, p. 258.
Coefficient of Expansion of Liquid Helium II*

The coefficient of thermal expansion of liquid helium II was measured from the λ point down to 0.85°K.
Near the λ point the coefficient varies very rapidly with temperature and may tend to minus infinity at the
λ point. The coefficient becomes positive below 1.15°K. The results are discussed in relation to Landau’s
theory and a value for $(\rho/\Delta)(\partial\Delta/\partial\rho)$ deduced.





1. INTRODUCTION 


The coefficient of thermal expansion of liquid
helium under its saturated vapor pressure has not
hitherto been measured directly, but it may be derived
from the density measurements of Onnes and Boks¹
along the vapor pressure curve, or by extrapolating the
density measurements obtained at higher pressures by
Keesom and Keesom.² From these two investigations
one obtains several points on a density versus temperature
graph, and it is immediately obvious that there is
a maximum near the λ point and the coefficient of expansion
is positive above the λ point but negative from
the λ point down to 1.2°K. However, there are too few
points for the coefficient to be derived accurately and,
in particular, the interesting region in the immediate
vicinity of the λ point is inadequately covered, as the
observations there are spaced 0.1°K apart.

Several recent investigations have indicated the need
for more accurate values of the coefficient of expansion.
It is important in Pippard’s³ theory of the attenuation
of first sound near the λ point, and is also needed to
derive certain parameters used in the Landau-Khalatnikov⁴
theories of the viscosity of the normal component
and the attenuation of first and second sound. It is also
essential to a proper discussion of the nature of the
λ transition.⁵⁻⁷ Moreover, it is now clear⁸ that liquid
helium is very similar to a Debye solid below 0.6°K
and its coefficient of expansion should then be positive.⁹
Using recent measurements of the variation of the
velocity of first sound with pressure, it is even possible
to predict the magnitude of this positive coefficient.¹⁰


* This research was assisted by grants from the National Research
Council of Canada and the Research Council of Ontario.

† Now at the Department of Physics, University of Pennsylvania,
Philadelphia, Pennsylvania.

‡ Now at the Department of Physics, Royal Military College,
Kingston, Canada.
The measurements described in the present paper 
originally extended down to 1.2°K, but the results 
suggested that the coefficient would change sign near 
1°K and we were therefore encouraged to extend the 
measurements to lower temperatures. The coefficient 
did, in fact, become positive below 1.15°K. 


2. METHOD 


The dilatometer, shown in Fig. 1, was completely 
immersed in a bath of liquid helium. The copper 
chamber A was filled with liquid helium by condensing 
in pure gas through the Monel tube D until the meniscus 
stood in the glass capillary C, and a valve at the room 
temperature end of D was then closed so that A, C, and 
D formed a closed system. The procedure was to make a 
small measured change in bath temperature and to use 
a cathetometer to measure the resulting change in 
position of the meniscus in C. The temperature of the 
bath was deduced from its vapor pressure as read on a 
butyl phthalate manometer, using the 1949 scale.¹¹
In addition the carbon resistance thermometer R was
used to detect changes in bath temperature as small as
10⁻⁵ °K and this made it possible to maintain the temperature
steady to better than 5×10⁻⁵ °K by adjusting
a fine needle valve in the pumping line. In a preliminary 
investigation, a differential oil manometer was used to 
show that the difference in vapor pressure between the 
liquid in A and in the bath was less than 0.2 mm of oil 
in the steady state, corresponding to a temperature 
difference of less than 0.0002 °K near the λ point. The 
temperature inside A was observed to follow a change of 
bath temperature within a time of the order of one 
minute. 

Several disturbing effects resulted from changes in 
the mass of gas contained in the dead space above the 
meniscus in C. As the bath level fell the mean tempera- 
ture of the Monel tube D slowly increased, the mass 
of gas contained in it slowly decreased and the meniscus 
drifted slowly upwards. It was difficult to make a satis- 
factory correction for this effect, so the dead space was 
reduced by inserting inside D a glass capillary with a 
bore of 0.1 mm, and the upward drift then became 
negligibly slow as long as the bath level was not allowed 
to fall below the bulb E₁. For a similar reason, the bath 
temperature was changed by adjusting a needle valve 


11 H. van Dijk and D. Shoenberg, Nature 164, 151 (1949).

Fig. 1. The dilatometer. A, copper chamber. C, glass capillary.
P₁ and P₂, platinum tubes sealed into C. D, Monel tube leading to
a valve at room temperature. R, carbon resistance thermometer.


in the pumping line rather than by varying the power 
in a heater immersed in the bath. The latter procedure 
varied the amount of gas flowing up the cryostat and 
hence disturbed the temperature distribution along the 
Monel tube D, producing an erratic variation in the 
mass of gas in the dead space. The dead space at room 
temperature was about 5 cm³, but, being at a much 
higher temperature, contained only a small fraction of 
the total mass of gas and was unimportant as long as the 
room temperature was not allowed to change by more 
than 1°K per hour. After all these precautions had been 
taken the drift of the meniscus at a steady temperature 
was less than 0.1 mm per hour. 

An unavoidable effect of a similar nature still re- 
mained. As the temperature was changed in order to 
make an expansion measurement, the change in vapor 
pressure resulted in a change in the mass of gas in the 
bulb E₁. The necessary correction was evaluated by
making measurements with the copper chamber A removed
and the end of the platinum tube P₂ sealed off.


Figure 2 shows how the position of the meniscus varied 
with temperature under these circumstances. The 
difference between this curve and a similar curve with A 
in place was assumed to arise entirely from the ex- 
pansion of the liquid in A. This procedure had the addi- 
tional advantage that it eliminated the expansion of the 
liquid in the bulb Ep. 

The peculiar shape of the curve in Fig. 2 requires
comment. The initial fall of the meniscus between K
and L was caused partly by the negative coefficient of
expansion of the liquid in the bulb E₁ and partly by the
evaporation of liquid into the dead space to increase
the vapor pressure. The subsequent rise from L to M
was much more pronounced before the dead space was
reduced and it may be explained in the following way.
Film flow up the outside of D cooled the dead space
immediately above the surface of the bath, but, as the
rate of film flow decreased on approaching the λ point,
the volume of dead space cooled in this way decreased
and some of the gas condensed out, producing the rise
in the meniscus. The exact position of the λ point was
determined with the aid of the following phenomena.
The bath temperature was allowed to rise slowly and
the reading of the carbon resistance thermometer was
continually observed. Below the λ point the temperature
was uniform throughout the bath and the temperature
of the resistor followed the temperature at the
surface, but as soon as the λ point was reached a vertical
temperature gradient began to establish itself and the
temperature in the vicinity of the thermometer rose
rapidly. The bath was then cooled again and steadied
down at a temperature about 5×10⁻⁵ °K below the
λ point as determined by the above method. It was then
observed that, even though the vapor pressure of the
bath and the reading of the carbon resistance thermometer
were both constant, there was a steady fall of the
meniscus as shown by the portion MW of the curve in
the inset to Fig. 2. Presumably, the good thermal conductivity
of the liquid column in the capillary C had
been destroyed so that, even though the temperature
of the liquid in E₁ remained constant, the liquid near
the meniscus was slowly warmed by the influx of stray
heat and evaporated into the dead space. Similar phenomena
were observed with the copper chamber A
in position and the onset of the downward drift of the
meniscus was subsequently used to indicate that the
liquid in A had reached the λ point.

Fig. 2. The variation of the meniscus position with temperature in
the absence of the chamber A. Capillary 2.

Two corrections were applied to obtain the final
results. As the meniscus rose in the capillary it displaced
gas which condensed out and produced a small
additional rise. The necessary correction was never
greater than 1 percent and could be estimated with
sufficient accuracy from the known density of the gas.
Also, as the length of the liquid column in the capillary
changed, the hydrostatic pressure on the liquid in A
changed sufficiently to alter its density significantly,
since liquid helium has a high compressibility. For
the finest capillary used, this effect required a correction
as high as 3½ percent.

The measurements below 1.2°K were made in a 
special cryostat which has been described elsewhere.¹³
In this case the glass capillary C was 60 cm long and 
extended up to liquid air temperatures. The tempera- 
ture was measured by a carbon resistance thermometer 
which had been calibrated against the magnetic sus- 
ceptibility of chrome alum. 

The apparatus proved unsuitable for measurements 
in the helium I region because of the disturbing effect 
of bubbles in the capillary C and the difficulty of ob- 
taining a uniform temperature throughout the system. 


TABLE I. 0.85°K to 2.05°K.

TABLE II. The vicinity of the λ point.

12 We are grateful to Dr. A. B. Pippard for pointing out the importance
of this correction.
13 Atkins, Edwards, and Pullan, Rev. Sci. Instr. (to be published).
3. RESULTS 


Three glass capillaries and two copper chambers 
were used. The internal diameters of the capillaries 
were: Capillary 1, 0.1016±0.0006 cm; Capillary 2,
0.0489±0.0002 cm; Capillary 3, 0.0465±0.0007 cm.
Chamber 1 had a volume of 29.27±0.15 cm³ and Chamber
2 a volume of 14.68±0.07 cm³. All these figures include
small corrections for the contraction between
room temperature and liquid helium temperatures. 
Above 1.2°K measurements were made with three com- 
binations, Capillary 1 and Chamber 1, Capillary 1 and 
Chamber 2, and Capillary 2 and Chamber 1, and all 
three sets of measurements agreed within the experi- 
mental error, suggesting that no important precaution 
or correction had been ignored. The temperature in- 
tervals were usually chosen so that the meniscus 
moved through a distance of the order of 1 cm and 
altogether more than 200 separate observations were 
made above 1.2°K. A curve was drawn through all these 
points and the resulting smoothed values of the expan- 
sion coefficient were estimated to have a random error 
of 1 percent. These smoothed values are collected in 
Tables I and II in the column headed $\alpha_s$. The measurements
below 1.2°K are also given in Table I and are
shown graphically in Fig. 3. Although these latter 
measurements have a random error of 5 to 10 percent, 
they demonstrate convincingly that the coefficient 
changes sign near 1.15°K. 
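
The order of magnitude of the effect being measured can be illustrated with a short calculation. The sketch below is not the authors' reduction procedure: it assumes that the liquid volume is simply the chamber volume, ignores every correction described in Sec. 2, and uses a hypothetical temperature step. Only the capillary diameter and chamber volume are taken from the figures quoted above; the function names are illustrative.

```python
import math


def fractional_volume_change(capillary_diameter_cm, meniscus_rise_cm,
                             chamber_volume_cm3):
    """Fractional change in liquid volume implied by a meniscus displacement.

    ASSUMPTION: the liquid volume is taken equal to the chamber volume and
    all dead-space and hydrostatic corrections are ignored.
    """
    bore_area = math.pi * capillary_diameter_cm ** 2 / 4.0
    return bore_area * meniscus_rise_cm / chamber_volume_cm3


def expansion_coefficient(capillary_diameter_cm, meniscus_rise_cm,
                          chamber_volume_cm3, temperature_step_K):
    """Mean uncorrected expansion coefficient over one temperature step."""
    dv_over_v = fractional_volume_change(
        capillary_diameter_cm, meniscus_rise_cm, chamber_volume_cm3)
    return dv_over_v / temperature_step_K


if __name__ == "__main__":
    # Capillary 1 and Chamber 1 from the text; a 1 cm fall of the meniscus
    # over a HYPOTHETICAL temperature step of 0.05 K.
    alpha = expansion_coefficient(0.1016, -1.0, 29.27, 0.05)
    print(f"uncorrected coefficient: {alpha:.3e} per degree")
```

A meniscus displacement of 1 cm in Capillary 1 with Chamber 1 therefore corresponds to a fractional volume change of a few parts in ten thousand, which gives some feeling for why the drifts and dead-space effects discussed in Sec. 2 had to be controlled so carefully.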

The quantity $\alpha_s$ measured in these experiments is the
coefficient of expansion as both the temperature and
pressure change along the vapor pressure curve.
Various other quantities which may be derived from $\alpha_s$
are also included in Tables I and II. The coefficient of
expansion at constant pressure, $\alpha_p$, may be obtained
from

$$\alpha_p = \alpha_s + \frac{\gamma}{\rho u_1^2}\left(\frac{dp}{dT}\right)_s, \tag{1}$$

where $(dp/dT)_s$ is the slope of the vapor pressure
curve." For the velocity of first sound, $u_1$, we used the
measurements of Atkins and Chase® increased by 0.8 
Fig. 3. The coefficient of expansion near 1°K. ○ Chamber 1,
Capillary 3. × Chamber 2, Capillary 3. ● Results obtained previously
above 1.2°K. ---- Phonon contribution. Derived
from straight line in Fig. 5.


percent as suggested by Chase.¹⁴ $\gamma$ is the ratio of the
specific heats at constant pressure and constant volume
and may be taken as 1.00 without introducing any significant
error in the correcting term. The actual value
of $\gamma$ can then be deduced from

$$\gamma - 1 = T u_1^2 \alpha_p^2 / C_p. \tag{2}$$


$C_p$, the specific heat at constant pressure, is equal to
$C_s$, the measured specific heat under the saturated
vapor pressure, to within 0.1 percent, but unfortunately
the various measurements of $C_s$ disagree by about 10
percent, and so there is a corresponding uncertainty in
$\gamma - 1$. The isothermal compressibility, $K_T$, can be
obtained from

$$K_T = \gamma / \rho u_1^2, \tag{3}$$


with an error of the order of 2 percent arising mainly
from $u_1^2$. The values quoted for all these quantities
correspond to pressures and temperatures on the vapor
pressure curve, but, from a theoretical point of view,
it may be more correct to use values of the coefficient
of expansion along a line of constant density, $\alpha_p'$. To
correct for small changes of density, one uses

$$\Delta\alpha_p = \left(\frac{\partial\alpha_p}{\partial p}\right)_T\left(\frac{\partial p}{\partial\rho}\right)_T\Delta\rho = -\frac{1}{K_T}\left(\frac{\partial K_T}{\partial T}\right)_p\frac{\Delta\rho}{\rho}. \tag{4}$$

The values of $\alpha_p'$ in Table I correspond to the density
at 0°K and zero pressure, while the values in Table II
are corrected to the density on the vapor pressure
curve at 2.180°K.

14 C. E. Chase, Proc. Roy. Soc. (London) A220, 116 (1953).
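
The chain of reductions in Eqs. (1) to (3) can be sketched as a few small functions. This is an illustration of the formulas only, not the authors' computation: the function and argument names are hypothetical, all quantities are assumed to be supplied in one consistent set of units, and no measured values are reproduced here.

```python
def alpha_p_from_alpha_s(alpha_s, gamma, rho, u1, dp_dT_s):
    """Eq. (1): constant-pressure coefficient from the saturation value.

    alpha_s  -- coefficient measured along the vapor pressure curve
    gamma    -- ratio of specific heats (1.00 is used as a first estimate)
    rho      -- liquid density
    u1       -- velocity of first sound
    dp_dT_s  -- slope of the vapor pressure curve
    """
    return alpha_s + gamma * dp_dT_s / (rho * u1 ** 2)


def gamma_minus_one(T, u1, alpha_p, C_p):
    """Eq. (2): gamma - 1 = T u1^2 alpha_p^2 / C_p (C_p per unit mass)."""
    return T * u1 ** 2 * alpha_p ** 2 / C_p


def isothermal_compressibility(gamma, rho, u1):
    """Eq. (3): K_T = gamma / (rho u1^2)."""
    return gamma / (rho * u1 ** 2)


def reduce_point(alpha_s, T, rho, u1, dp_dT_s, C_p):
    """Apply Eqs. (1)-(3) in the order described in the text."""
    # First pass with gamma = 1.00, as the text suggests for the small
    # correcting term in Eq. (1).
    alpha_p = alpha_p_from_alpha_s(alpha_s, 1.0, rho, u1, dp_dT_s)
    gamma = 1.0 + gamma_minus_one(T, u1, alpha_p, C_p)
    K_T = isothermal_compressibility(gamma, rho, u1)
    return alpha_p, gamma, K_T
```

The ordering in `reduce_point` follows the statement that $\gamma$ may be set to 1.00 in the correcting term of Eq. (1) and then deduced from Eq. (2); iterating the two steps would be a possible refinement, though the text gives no indication that this was needed.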


4. DISCUSSION 
4.1 The λ Transition 


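Ehrenfest’s¹⁵ treatment of the λ point as a second-order
transition assumes that the Gibbs free energy
and its first-order derivatives are continuous, but that
there are finite discontinuities in the second-order
derivatives $C_p$, $\alpha_p$, and $K_T$. He then deduces that the
slope of the λ line is

$$\left(\frac{dp}{dT}\right)_\lambda = \rho\,(C_p^{\mathrm{II}} - C_p^{\mathrm{I}})/T(\alpha_p^{\mathrm{II}} - \alpha_p^{\mathrm{I}}) \tag{5}$$

$$= (\alpha_p^{\mathrm{II}} - \alpha_p^{\mathrm{I}})/(K_T^{\mathrm{II}} - K_T^{\mathrm{I}}), \tag{6}$$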


where $X^{\mathrm{II}}$ is obtained by extrapolating the quantity $X$
to the λ point on the liquid helium II side and $X^{\mathrm{I}}$ is
obtained by a similar extrapolation on the liquid helium I
side. Figure 4 shows the behavior of $\alpha_p^{\mathrm{II}}$ in the
neighborhood of the λ point and it is obviously varying
so rapidly that it is impossible to make a satisfactory
extrapolation in order to deduce $\alpha_p^{\mathrm{I}}$. The same difficulty
is experienced with $K_T$ (Table II) and $C_p$.¹⁶ We
must therefore abandon the attempt to verify the
Ehrenfest relations on the basis of present experimental
evidence.

Tisza,¹⁷ however, has presented an alternative treatment
of λ transitions which assumes that some of the
second-order derivatives of the Gibbs free energy tend
to infinity at the transition temperature. This is consistent
with the behavior of $C_p$¹⁶ and $K_T$® and is now
seen to be equally consistent with the behavior of $\alpha_p$
(Fig. 4). In fact, between 2.1°K and the λ point, the
coefficient of expansion may be represented by


$$\alpha_p^{\rho}=+0.0008+0.0148\log(T_\lambda-T), \tag{7}$$


which tends to minus infinity at the λ point (full curve
in Fig. 4). The form of Eq. (7) is of particular interest
because Onsager¹⁸ has discussed a two-dimensional
model of a ferromagnetic and has shown that the
specific heat in the vicinity of the Curie point is
of the form $A\log(T_c-T)+B$. In the present case it is relevant
to enquire whether there should not be additional
terms representing, for example, the contribution of
the phonons, but these terms would probably vary
slowly in the vicinity of the λ point.

Fig. 4. The vicinity of the λ point. ● Chamber 1, Capillary 1.
× Chamber 2, Capillary 1. + Chamber 1, Capillary 2.
$\alpha_p^{\rho}=0.0008+0.0148\log(T_\lambda-T)$.


15 P. Ehrenfest, Proc. Roy. Acad. Amsterdam 36, 153 (1933).

16 W. H. Keesom and A. P. Keesom, Physica 2, 557 (1935).

17 L. Tisza, in Phase Transformations in Solids, edited by R.
Smoluchowski (John Wiley and Sons, Inc., New York, 1951), p. 1.


4.2 The Phonon Contribution 


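The coefficient of expansion at constant pressure is


$$\alpha_p=\frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_p=-\frac{1}{V}\left(\frac{\partial S}{\partial p}\right)_T. \tag{8}$$

In terms of Landau's theory,¹⁹,²⁰ the entropy $S$ may be
written as the sum of roton and phonon contributions,
so that


$$\alpha_p=-\frac{1}{V}\left(\frac{\partial S_{ph}}{\partial p}\right)_T-\frac{1}{V}\left(\frac{\partial S_r}{\partial p}\right)_T=\alpha_{ph}+\alpha_r. \tag{9}$$

Since

$$S_{ph}=\frac{16\pi^{5}k^{4}T^{3}}{45h^{3}c^{3}\rho}, \tag{10}$$


the phonon contribution to the coefficient of expansion is


$$\alpha_{ph}=\frac{16\pi^{5}k^{4}T^{3}}{15h^{3}c^{3}}\left(\frac{1}{c}\frac{\partial c}{\partial p}+\frac{1}{3}K_T\right). \tag{11}$$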
The velocity, $c$, of the phonons was assumed to be the
same as the velocity of first sound extrapolated to 0°K,
or 239±2 m/sec.®* With the assistance of Dr. Gotlieb
and Dr. Chung of the University of Toronto Computation
Center, the quantity $dc/dp$ was evaluated by fitting
the power series $c=c_0+a_1p+a_2p^{2}+\cdots$ to the data
of Atkins and Stasior® at 1.21°K. The results
were $a_1=(7.64\pm0.13)\times10^{-4}$ cm³ sec⁻¹ dyne⁻¹ and
$a_2=(-1.63\pm0.12)\times10^{-11}$ cm⁵ sec⁻¹ dyne⁻². Equation
(11) then reduces to²¹


$$\alpha_{ph}=(+1.08\pm0.04)\times10^{-3}\,T^{3}\ \mathrm{deg}^{-1}. \tag{12}$$


18 L. Onsager, Phys. Rev. 65, 117 (1944).

19 L. Landau, J. Phys. (U.S.S.R.) 5, 71 (1941).

20 L. Landau, J. Phys. (U.S.S.R.) 11, 91 (1947).

21 This equation differs slightly from an earlier estimate (reference
10) because of the more accurate evaluation of $dc/dp$.
This is plotted as the dashed curve in Fig. 3. Below
0.6°K the entropy is almost entirely due to the phonons®
and the total coefficient of expansion probably tends
asymptotically to Eq. (12) at sufficiently low temperatures.
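
As a rough numerical check on the phonon coefficient, the prefactor of Eq. (11) can be evaluated directly. The sketch below is illustrative only: it uses modern CGS values of $k$ and $h$, takes the sound velocity and the fitted slope $a_1$ quoted in the text as $c$ and $dc/dp$, and estimates $K_T$ from Eq. (3) with $\gamma$ set to 1 and an assumed density of 0.145 g cm⁻³ (the density is not stated in this section). The function name is hypothetical.

```python
import math

# Physical constants in CGS units (modern values, used here as an assumption).
K_BOLTZMANN = 1.380649e-16   # erg / K
H_PLANCK = 6.62607e-27       # erg s


def phonon_expansion_coefficient(c, dc_dp, k_t):
    """Return A such that alpha_ph = A * T**3, following Eq. (11)."""
    prefactor = 16 * math.pi**5 * K_BOLTZMANN**4 / (15 * H_PLANCK**3 * c**3)
    return prefactor * (dc_dp / c + k_t / 3)


c0 = 2.39e4    # cm/sec, first-sound velocity extrapolated to 0 K (from the text)
a1 = 7.64e-4   # cm^3 sec^-1 dyne^-1, fitted slope dc/dp (from the text)
rho = 0.145    # g/cm^3, ASSUMED density of liquid helium
k_t = 1.0 / (rho * c0**2)   # Eq. (3) with gamma taken as 1 (assumption)

coefficient = phonon_expansion_coefficient(c0, a1, k_t)
print(f"alpha_ph ~ {coefficient:.3e} * T^3 per degree")
```

Under these assumptions the coefficient falls inside the quoted range of Eq. (12); the compressibility term contributes only about a tenth of the bracket, so the result is dominated by $dc/dp$.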

In the next section, we shall assume that Eq. (12)
represents the phonon contribution up to 2.0°K. This
ignores the variation of $c$ and $dc/dp$ with temperature
and introduces uncertainties of the order of 20 percent.
The Debye temperature is 19.8°K and departures from
the $T^{3}$ law therefore amount to only 2 percent at 2.0°K
and have been neglected. Differences between the actual
spectrum of the normal modes and the Debye spectrum
might also be important but cannot be evaluated at
present. These considerations are relevant because,
even at 2.0°K, $\alpha_{ph}$ has the same order of magnitude as $\alpha_r$.


4.3 The Roton Contribution 
According to Landau, the roton entropy is

$$S_r=\frac{2k^{1/2}\mu^{1/2}p_0^{2}\Delta}{(2\pi)^{3/2}\rho T^{1/2}\hbar^{3}}\left(1+\frac{3kT}{2\Delta}\right)\exp(-\Delta/kT). \tag{13}$$


Differentiating with respect to the pressure, one obtains 
2k perA 3kT 
= ae 1+—) exp(—A/kT) 
(2) pT hu? 2A 


Fe 2p0po pod poA A 


2udp podp Adp ~ Adp kT 
3(kT/A)? 
x{1+———_| | (14) 
1+3kT/2A 














Fig. 5. Test of Landau's theory.

Fig. 6. The increase in density caused by the rotons, $\delta\rho_r$, plotted
against the fraction of normal component, $\rho_n/\rho$.


or, alternatively, 


(15) 


OU? A 
_— = @ 


$(kT/A)? | 
S, kT ‘ 


1+3kT/2A 


1 p ou 


qa@=-- 


 2pa 


p OA 


A dp 


Ph Opo p OA 
po Op 


? 


A Op 


In Fig. 5, $Y=-\alpha_ru_1^{2}/S_r$ has been plotted against

$$\frac{\Delta}{kT}\left\{1+\frac{\tfrac{3}{2}(kT/\Delta)^{2}}{1+3kT/2\Delta}\right\}.$$

$\alpha_r$ was deduced by subtracting the values of $\alpha_{ph}$ given
by Eq. (12) from the values of $\alpha_p^{\rho}$ in Table I, the coefficient
of expansion along a line of constant density being
preferred since it is then more likely that $\Delta$, $\mu$, and $p_0$
are independent of temperature. Khalatnikov's* value
of $\Delta/k=8.9$°K was used, and $S_r$ was calculated from
Eq. (13) with $\mu$ and $p_0$ chosen so that the total entropy
at 1.6°K was 0.314 joule g⁻¹ deg⁻¹ in agreement with
some recent measurements of Hercus and Wilks.²²
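
The quantity used as the abscissa of Fig. 5 is easy to tabulate. The following sketch assumes the form $(\Delta/kT)\{1+\tfrac{3}{2}(kT/\Delta)^{2}/(1+3kT/2\Delta)\}$, which is what differentiating the factor $(1+3kT/2\Delta)\exp(-\Delta/kT)$ with respect to $\Delta$ gives, together with the value $\Delta/k=8.9$°K quoted in the text. The function name and the sample temperatures are hypothetical.

```python
def fig5_abscissa(temperature, delta_over_k=8.9):
    """(Delta/kT) * {1 + (3/2)(kT/Delta)^2 / (1 + 3kT/(2 Delta))}."""
    x = temperature / delta_over_k   # kT / Delta, dimensionless
    return (1.0 / x) * (1.0 + 1.5 * x**2 / (1.0 + 1.5 * x))


# Sample temperatures in deg K, chosen only for illustration.
for t in (1.0, 1.2, 1.4, 1.6):
    print(t, round(fig5_abscissa(t), 3))
```

The correction in braces stays within a few percent of unity over this range, so the abscissa is close to $\Delta/kT$ itself.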
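Below 1.6°K a straight line can be fitted to the points
in Fig. 5, yielding

$$\frac{\rho}{\Delta}\frac{\partial\Delta}{\partial\rho}=-0.57\pm0.06, \tag{16}$$

$$\frac{1}{2}\frac{\rho}{\mu}\frac{\partial\mu}{\partial\rho}+2\frac{\rho}{p_0}\frac{\partial p_0}{\partial\rho}+\frac{\rho}{\Delta}\frac{\partial\Delta}{\partial\rho}=-0.95\pm0.2. \tag{17}$$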


Above 1.6°K the theory might be expected to break 
down because, as the number of rotons and phonons 
increases, the interactions between them become more 
important. Khalatnikov‘ quotes (p/A)(dA/dp)~—} 
but gives no details of his calculations. 

It will be seen that the negative coefficient of expansion
just below the λ point is a consequence of the fact
that $(\rho/\Delta)(\partial\Delta/\partial\rho)$ is negative and the resulting negative
value of $\alpha_r$ is numerically greater than the positive
value of $\alpha_{ph}$, although not overwhelmingly so. However,
$\alpha_r$ decreases exponentially with falling temperature
[Eq. (14)] whereas $\alpha_{ph}$ varies only as $T^{3}$ [Eq. (12)], and
so, at a sufficiently low temperature, $\alpha_{ph}$ begins to
predominate and the total coefficient of expansion
becomes positive. Theories, such as that of Feynman,²³
which give an atomistic explanation of the nature
of rotons could be tested by comparing their predictions
of $(\rho/\Delta)(\partial\Delta/\partial\rho)$, $(\rho/\mu)(\partial\mu/\partial\rho)$, and $(\rho/p_0)(\partial p_0/\partial\rho)$ with
Eqs. (16) and (17).

Integrating Eq. (9) with respect to the temperature,
the density is obtained in the form


$$\rho=\rho_0+\delta\rho_{ph}+\delta\rho_r. \tag{18}$$


Figure 6 demonstrates that $\delta\rho_r$ is very nearly linearly
proportional to $\rho_n/\rho$, the fraction of normal component.
It is difficult to say whether this has any
fundamental significance, but it is interesting because
the approximate proportionality between $S_r$ and $\rho_n/\rho$
has long been recognized.²⁴⁻²⁶


22 We are grateful to Dr. J. Wilks for information on his entropy
measurements in advance of publication. As his values are 10
percent higher than previous ones, there is this amount of
uncertainty in $S_r$.

23 R. P. Feynman, Phys. Rev. 91, 1291, 1301 (1953).

24 F. London, Revs. Modern Phys. 17, 310 (1945).

25 L. Tisza, Phys. Rev. 72, 838 (1947).

26 Gorter, Kasteleijn, and Mellink, Physica 16, 113 (1950).
This is the first of two papers dealing with a systematic study
of the linearized, unbounded medium problems in magneto- 
hydrodynamics of incompressible and compressible fluids. Part I 
deals with the fundamental equations which are set up quite 
generally for an ideal, homogeneous and isotropic, conducting fluid 
devoid of viscosity and expansive friction, subject only to the 
initial assumption that the externally applied field of magnetic 
induction be constant and uniform. The energy and momentum 
balance in a magneto-hydrodynamic field is verified with the aid 
of the exact fundamental equations and the conservation laws of 
energy and momentum, for a rigid volume fixed in the (stationary) 
observer’s inertial frame of reference, are displayed in differential 
and in integral form. By successive eliminations there is obtained 
a single partial differential equation in the particle velocity from 


which the unwanted second-order terms are merely dropped in a 
linearized small amplitude theory, a process which is fully justified 
by considering the special case of infinite conductivity, zero dis- 
placement current, and incompressible fluids. Also, assuming that 
a particular solution of the linearized magneto-hydrodynamic wave 
equation has been obtained, it is shown how to compute quite 
generally, from the linearized Maxwellian set, the accompanying 
electromagnetic field vectors expressed in terms of the assumed 
velocity field. These computations are carried out for plane homo- 
geneous waves and for time-harmonic cylindrical waves. The 
actual determination of particular wave functions appropriate for 
incompressible and compressible fluids, together with the computa- 
tion of the corresponding wave numbers, is reserved for the sequel 
to this paper, Part II.





1. INTRODUCTION 


THE field of magneto-hydrodynamics, like hydrodynamics
itself, is essentially nonlinear, for the
interaction between a moving conducting fluid and the 
electromagnetic field also contains nonlinear terms. The 
importance of the new field, especially in cosmic physics, 
has been attested by a score of papers on various sub- 
jects such as solar physics, cosmic radiation, stellar 
oscillations, geomagnetism, propagation in an ionized 
atmosphere, etc., in which nonlinear phenomena are 
very much in evidence. A brief account of these re- 
searches, together with a complete bibliography, has 
been given by Lundquist¹ in an excellent review paper.
In this paper, except where noted, we confine our atten- 
tion to the important class of linearized problems which 
give rise to time harmonic magneto-hydrodynamic 
waves in compressible and incompressible fluids. 

The underlying fundamental notions in the theory of
magneto-hydrodynamic waves in an incompressible
fluid were originally given by Alfvén² in the course of his
researches on the theory of sun-spots. The theory of
magneto-hydrodynamic waves was first considered in
some detail by Walén³ who set up the magneto-hydrodynamic
equations starting with the principle of conservation
of energy. Laboratory experiments in magneto-hydrodynamic
waves in mercury have been reported by
Lundquist,⁴ and more recently using liquid sodium by
Lehnert.⁵


* This research was supported by the U. S. Air Force, through 
the Office of Scientific Research of the Air Research and Develop- 
ment Command. 

1 S. Lundquist, Arkiv Fysik 5, 297 (1952).

2 H. Alfvén, Nature 150, 405 (1942); Arkiv Mat. Astron. Fysik
B29, No. 2 (1942). See also H. Alfvén, Cosmical Electrodynamics
(Oxford University Press, London, 1950), C

3 C. Walén, Arkiv Mat. Astron. Fysik A30, No. 15 (1944).

4 S. Lundquist, Phys. Rev. 76, 1805 (1949).

5 B. Lehnert, Phys. Rev. 94, 815 (1954).


Waves in an ionized gas in the presence of a magnetic
field have been considered by Åström⁶ and magneto-hydrodynamic
waves in a compressible fluid of infinite
conductivity have been studied by Herlofson.⁷ A more
systematic account of plane magneto-hydrodynamic
waves including the effects of finite conductivity, viscosity,
and compressibility of the medium is found in a
paper by van de Hulst.⁸ Time harmonic cylindrical
waves in compressible and incompressible fluids have
been considered by Lundquist.⁹ Also, in a recent paper,
Hines¹⁰ develops some generalized magneto-hydrodynamic
formulas, using an extension of the magneto-ionic
approach, which are applicable where a purely
macroscopic point of view is no longer tenable.

However, nowhere do we find a complete account of 
the fundamental equations without simplifying assump- 
tions injected from the outset, nor a systematic analysis 
of the linearized, unbounded media, and boundary value 
problems in the field of magneto-hydrodynamics of 
incompressible and compressible fluids. In this paper, 
Part I, we propose to fulfill this need by first giving a 
detailed discussion of the general linearized theory of 
magneto-hydrodynamic phenomena in an unbounded, 
homogeneous and isotropic, conducting fluid embedded 
in a constant and uniform field of magnetic induction, 
and then examining in general the structure of plane 
homogeneous waves and of time-harmonic cylindrical 
waves. The application of the theory to the specific 
cases of incompressible and compressible fluids and the 
actual determination of the fundamental wave functions 
corresponding to all possible modes of propagation is 
reserved for the sequel to this paper, Part II. 


6 E. Åström, Nature 165, 1019 (1950).

7 N. Herlofson, Nature 165, 1020 (1950).

8 H. C. van de Hulst, Problems of Cosmical Aerodynamics
(Central Air Documents Office, Dayton, 1951), Chap. 6.

9 Reference 1, Sec. C.

10 C. O. Hines, Proc. Cambridge Phil. Soc. 49, Part 2, 299-307
(1953).
2. FUNDAMENTAL EQUATIONS


The systematic study of the linearized, unbounded 
medium problems in magneto-hydrodynamics requires 
the simultaneous solution of the Maxwellian equations 
for a moving medium and the Eulerian equations of 
motion of the fluid in the presence of the ponderomotive 
force density of electromagnetic origin. To simplify the 
analysis from the outset, we assume a homogeneous and 
isotropic conducting fluid of infinite extent embedded in 
a uniform magnetic field. Furthermore, we consider only 
ideal fluids devoid of viscosity and expansive friction. 
Finally, since our approach is purely macroscopic, we do 
not consider phenomena in ionized gases at low densities. 


2.1 Maxwellian Equations 


First, we assume that the homogeneous and isotropic
medium is characterized, in rationalized mks units, by
the rigorously constant macroscopic parameters¹¹ $\mu$, $\epsilon$,
and $\sigma$. Furthermore, we adopt here Minkowski's relativistic
electrodynamics of moving bodies, according to
which the Maxwellian set becomes¹²


$$\begin{aligned}
&\text{(I)}\ \ \nabla\times\mathbf{e}+\mu(\partial\mathbf{h}/\partial t)=0, &&\text{(III)}\ \ \nabla\cdot\mathbf{h}=0,\\
&\text{(II)}\ \ \nabla\times\mathbf{h}-\epsilon(\partial\mathbf{e}/\partial t)=\mathbf{j}, &&\text{(IV)}\ \ \nabla\cdot\mathbf{e}=\eta/\epsilon,\\
&\text{(V)}\ \ \mathbf{j}=\eta\mathbf{v}+\sigma(\mathbf{e}+\mathbf{v}\times\mathbf{B}),\\
&\text{(VI)}\ \ \partial\eta/\partial t+\nabla\cdot\mathbf{j}=0,
\end{aligned} \tag{1}$$


where $\mathbf{e}$ and $\mathbf{h}$ represent the electric and magnetic
intensities of the induced field, $\mathbf{j}$ the current density in
the medium, $\mathbf{v}$ the velocity of the fluid, and $\mathbf{B}$ the total
magnetic induction prevailing at a point and representing
the sum, $\mathbf{B}=\mathbf{B}_0+\mu\mathbf{h}$, of the externally applied
field $\mathbf{B}_0$ (which is assumed constant and uniform
throughout this investigation) and the induced field $\mu\mathbf{h}$.
Equation (IV) defines the electric charge density by the
unconventional symbol $\eta$, and (V) exhibits the current
density vector as the sum of the convection current $\eta\mathbf{v}$
plus the conduction current $\sigma(\mathbf{e}+\mathbf{v}\times\mathbf{B})$. Finally, (VI)
expresses the principle of conservation of charge (equation
of continuity) that governs the behavior of the
charge and current densities.

It must be clearly understood that, in (I-VI), all field
vectors are referred to the (stationary) observer's
inertial frame of reference in which the fluid is moving
with the instantaneous velocity $\mathbf{v}=\mathbf{v}(\mathbf{r},t)$, where $\mathbf{r}$ is the
position vector of the point of observation referred to
the said inertial frame. Admittedly, the set (I-VI)
applies rigorously only to uniformly moving bodies and
its application to more complicated kinds of motion may
be regarded at least as a first approximation. Fortunately,
however, if the fluid motion is nonrelativistic
($v\ll c$) and if the accelerations produced by the electromagnetic
forces are small, which is the case in the
present instance, the set (I-VI) adequately describes
the electromagnetic field associated with (slow moving)
accelerated bodies.¹³


11 In this paper we assume a fluid with the permeability and
dielectric constant of vacuum, i.e., $\mu\epsilon=c^{-2}$. The extension of the
theory to media having more general electric and magnetic
properties can be readily made, but is not considered here.

12 See, for example, R. C. Tolman, Relativity, Thermodynamics,
and Cosmology (Oxford University Press, London, 1934), Sec. 52;
C. Møller, The Theory of Relativity (Oxford University Press,
London, 1952), Sec. 73.


2.2 Eulerian Equations 


For an ideal conducting fluid devoid of viscosity and
expansive friction, the hydrodynamic equations of
motion and the equation of continuity (conservation of
mass) are

$$\begin{aligned}
&\text{(VII)}\ \ \rho(d\mathbf{v}/dt)+\nabla p=\mathbf{f},\\
&\text{(VIII)}\ \ \partial\rho/\partial t+\nabla\cdot(\rho\mathbf{v})=0,
\end{aligned} \tag{2}$$

where $\rho$ is the density of the fluid, $p$ the hydrodynamic
pressure, and $\mathbf{f}$ is the ponderomotive force density which,
in the assumed absence of gravity, must be equated to
the Lorentz force density acting on the charge and
current distribution, i.e.,


$$\mathbf{f}=\eta\mathbf{e}+\mathbf{j}\times\mathbf{B}, \tag{3}$$


in which $\mathbf{e}$ is the electric intensity and $\mathbf{B}$ the total field of
magnetic induction. In Euler's Eq. (VII), the total
time derivative of the velocity is given by


$$d\mathbf{v}/dt=\partial\mathbf{v}/\partial t+(\mathbf{v}\cdot\nabla)\mathbf{v}=\partial\mathbf{v}/\partial t+\tfrac{1}{2}\nabla v^{2}+(\nabla\times\mathbf{v})\times\mathbf{v}, \tag{4}$$


in which the latter form is invariant.
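
The equality of the two forms of the convective term in Eq. (4) can be spot-checked numerically. The sketch below is only an illustration: the test field, the evaluation point, and all function names are hypothetical, and derivatives are approximated by central differences.

```python
import math


def velocity(x, y, z):
    # Arbitrary smooth test field (hypothetical, for illustration only).
    return (y * z, x * x + z, math.sin(x) * y)


def partial(func, point, axis, h=1e-5):
    """Central-difference derivative of a vector-valued func along one axis."""
    forward, backward = list(point), list(point)
    forward[axis] += h
    backward[axis] -= h
    f_plus, f_minus = func(*forward), func(*backward)
    return tuple((a - b) / (2 * h) for a, b in zip(f_plus, f_minus))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def convective_term(point):
    """(v . grad) v, the first form in Eq. (4)."""
    v = velocity(*point)
    grads = [partial(velocity, point, j) for j in range(3)]  # grads[j][i] = dv_i/dx_j
    return tuple(sum(v[j] * grads[j][i] for j in range(3)) for i in range(3))


def invariant_form(point):
    """(1/2) grad(v^2) + (curl v) x v, the second form in Eq. (4)."""
    v = velocity(*point)
    grads = [partial(velocity, point, j) for j in range(3)]
    half_grad_v2 = tuple(sum(v[i] * grads[j][i] for i in range(3)) for j in range(3))
    curl = (grads[1][2] - grads[2][1],
            grads[2][0] - grads[0][2],
            grads[0][1] - grads[1][0])
    return tuple(a + b for a, b in zip(half_grad_v2, cross(curl, v)))


point = (0.3, -0.7, 1.1)
print(convective_term(point))
print(invariant_form(point))
```

Both functions use the same finite-difference gradients, so the two printed vectors should agree to rounding error for any smooth field.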


2.3 Energy and Momentum Balance 


The system of equations (I-VIII), together with Eq.
(3), is sufficient to describe completely the behavior
of a magneto-hydrodynamic field. In particular, it is
instructive to verify from these equations, which of
course apply to the complete nonlinear theory, that the
phenomenon is governed by the laws of conservation of
energy and momentum. To this end we consider first the
power per unit of volume developed by the ponderomotive
force density of electromagnetic origin; that is,
introducing the conduction current density


$$\mathbf{J}=\mathbf{j}-\eta\mathbf{v}=\sigma(\mathbf{e}+\mathbf{v}\times\mathbf{B}), \tag{5}$$

we compute from (3) the power per unit of volume

$$\mathbf{f}\cdot\mathbf{v}=\eta(\mathbf{e}\cdot\mathbf{v})-\mathbf{j}\cdot(\mathbf{v}\times\mathbf{B})=-J^{2}/\sigma+\mathbf{j}\cdot\mathbf{e}, \tag{6}$$


in which the second form is obtained from the first by
eliminating $\mathbf{v}\times\mathbf{B}$ with the aid of (5). Next, we replace $\mathbf{j}$
in the last form of (6) by the left member of (II) and,
making use of (I), we obtain finally


$$\mathbf{f}\cdot\mathbf{v}=-J^{2}/\sigma-(\partial/\partial t)[\tfrac{1}{2}\epsilon e^{2}+\tfrac{1}{2}\mu H^{2}]-\nabla\cdot(\mathbf{e}\times\mathbf{H}), \tag{7}$$

which expresses the power per unit of volume in terms of
the conduction current density $\mathbf{J}$ and of the electromagnetic
field vectors $\mathbf{e}$ and $\mathbf{H}$, where $\mathbf{H}$ is the total
magnetic intensity, $\mathbf{H}=\mathbf{H}_0+\mathbf{h}$.


a C. Tolman, reference 12, p. 101; C. Mller, reference 12, 
p. i ; 
Proceeding similarly, we can also express the Lorentz
force density itself in terms of the electromagnetic field
vectors. Introducing for the purpose the total Maxwell's
electromagnetic stress tensor¹⁴


$$\mathsf{T}_t=\epsilon[\mathbf{e}\mathbf{e}-\tfrac{1}{2}e^{2}\mathcal{I}]+\mu[\mathbf{H}\mathbf{H}-\tfrac{1}{2}H^{2}\mathcal{I}], \tag{8}$$


where $\mathcal{I}$ denotes the idemfactor, and taking its (tensor)
divergence we have


$$\nabla\cdot\mathsf{T}_t=\epsilon[\mathbf{e}\nabla\cdot\mathbf{e}+(\nabla\times\mathbf{e})\times\mathbf{e}]+\mu[\mathbf{H}\nabla\cdot\mathbf{H}+(\nabla\times\mathbf{H})\times\mathbf{H}], \tag{9}$$


with the aid of which we obtain directly from (3),
making use of (II) and (IV) to eliminate respectively
the current and charge densities, the desired expression
for the Lorentz force density


$$\mathbf{f}=\nabla\cdot\mathsf{T}_t-\mu\epsilon(\partial/\partial t)(\mathbf{e}\times\mathbf{H}), \tag{10}$$


in which there appear only the electromagnetic field
vectors $\mathbf{e}$ and $\mathbf{H}$.


Energy Balance 


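To verify the energy balance, we take the scalar
product of $\mathbf{v}$ and the vectors on both sides of (VII) to
obtain

$$\rho\mathbf{v}\cdot(d\mathbf{v}/dt)+\mathbf{v}\cdot\nabla p=\mathbf{f}\cdot\mathbf{v}. \tag{11}$$


Next, making use of (4), we note that the first term
above can be written as


$$\rho\mathbf{v}\cdot(d\mathbf{v}/dt)=\rho\mathbf{v}\cdot(\partial\mathbf{v}/\partial t)+\tfrac{1}{2}\rho\mathbf{v}\cdot\nabla v^{2}=(\partial/\partial t)(\tfrac{1}{2}\rho v^{2})+\nabla\cdot(\tfrac{1}{2}\rho v^{2}\mathbf{v}), \tag{12}$$


in which the latter form is deduced from (VIII). Similarly,
the second term may be written as


$$\mathbf{v}\cdot\nabla p=\nabla\cdot(p\mathbf{v})-p\nabla\cdot\mathbf{v}=\nabla\cdot(p\mathbf{v})+(p/\rho)(d\rho/dt), \tag{13}$$


in which again the latter form is deduced with the aid of
the equation of continuity.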

Substituting into (11) the forms (12), (13), and (7)
and transposing terms, we obtain finally the equation


$$(\partial/\partial t)(\tfrac{1}{2}\rho v^{2})+(\partial/\partial t)(\tfrac{1}{2}\epsilon e^{2}+\tfrac{1}{2}\mu H^{2})=-J^{2}/\sigma-(p/\rho)(d\rho/dt)-\nabla\cdot[\mathbf{e}\times\mathbf{H}+(\tfrac{1}{2}\rho v^{2}+p)\mathbf{v}], \tag{14}$$


which represents in differential form the conservation of
energy in a magneto-hydrodynamic field per unit volume
fixed in the observer's inertial frame of reference. In
fact, the terms on the left side of (14) represent the time
rate of increase of the total energy (kinetic plus electromagnetic)
stored per unit volume, and this rate of increase
must be accounted for by the terms on the right
side. Thus, the term $-J^{2}/\sigma$ represents the rate of Joule
heat loss per unit volume, which is an irreversible
process, whereas the term $-(p/\rho)(d\rho/dt)$ may be interpreted
as the (reversible) rate of doing work per unit
volume associated with pressure fluctuations. And,
finally, the divergence term on the right side of (14) is
readily interpreted as the rate at which energy flows
through the walls into the unit volume, the flow consisting
of electromagnetic energy and total mechanical
energy (kinetic plus potential). Equation (14) was used
by Walén¹⁵ as the starting point for his derivation of the
magneto-hydrodynamic equations, except that Walén's
equation, as written, is correct only for incompressible
fluids ($\nabla\cdot\mathbf{v}=0$) and only if one replaces the total time
derivative of the electromagnetic energy density by the
partial derivative.


14 W. Heitler, The Quantum Theory of Radiation (Oxford University
Press, London, 1944), second edition, p. 7; see also J. A.
Stratton, Electromagnetic Theory (McGraw-Hill Book Company,
Inc., New York, 1941), Secs. 2.5 and 2.6.

To clarify the above interpretation of the energy 
balance, suppose we multiply both sides of (14) by the 
element of volume dr and integrate throughout a rigid 
volume fixed in the observer’s inertial frame of reference. 
In this way, making use of the divergence theorem, we 
obtain 


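$$\frac{d}{dt}\int_V\left[\tfrac{1}{2}\rho v^{2}+\left(\tfrac{1}{2}\epsilon e^{2}+\tfrac{1}{2}\mu H^{2}\right)\right]d\tau=-\int_V\left[\frac{J^{2}}{\sigma}+\frac{p}{\rho}\frac{d\rho}{dt}\right]d\tau-\int_S\mathbf{n}\cdot[\mathbf{e}\times\mathbf{H}+(\tfrac{1}{2}\rho v^{2}+p)\mathbf{v}]\,da, \tag{15}$$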


which expresses in integral form the conservation of 
energy for a fixed volume in a magneto-hydrodynamic 
field. Thus, the volume integral on the left represents 
the time rate of increase of the total (kinetic plus 
electromagnetic) energy stored within the fixed volume, 
and the volume integral on the right accounts, re- 
spectively, for the irreversible Joule heat loss through- 
out the volume and for the reversible rate of doing work 
associated with pressure fluctuations. And, finally, the 
surface integral on the right measures the time rate of 
influx of electromagnetic and total mechanical energy 
through the walls of the fixed volume. 


Momentum Balance 


To establish the momentum balance we need only
refer to (VII), noting that the first term on the left may
be written as


$$\rho(d\mathbf{v}/dt)=\rho(\partial\mathbf{v}/\partial t)+\rho(\mathbf{v}\cdot\nabla)\mathbf{v}=(\partial/\partial t)(\rho\mathbf{v})+\nabla\cdot(\rho\mathbf{v}\mathbf{v}), \tag{16}$$

where the latter form is deduced with the aid of (VIII).
Then, replacing $\nabla p$ by $\nabla\cdot(p\mathcal{I})$, where $\mathcal{I}$ is the idemfactor,
and substituting into (VII) the forms (16) and (10), we
obtain after transposing terms


$$(\partial/\partial t)(\rho\mathbf{v})+\mu\epsilon(\partial/\partial t)(\mathbf{e}\times\mathbf{H})=\nabla\cdot[\mathsf{T}_t-(\rho\mathbf{v})\mathbf{v}-p\mathcal{I}], \tag{17}$$

which represents in differential form the conservation of
momentum in a magneto-hydrodynamic field per unit
volume fixed in the observer's inertial frame of reference.
In fact, the terms on the left side of (17) express the
time rate of change of the total mechanical plus electromagnetic
momentum contained in a unit volume and
this must be equal to the force acting on the matter and
the electromagnetic field within the unit volume as accounted
for by the divergence term on the right. Thus,
this force which acts through the walls of the unit
volume is seen to consist of three terms: the electromagnetic
stresses, the influx of matter carrying momentum,
and the net force due to the pressure acting at
right angles to the walls of the unit volume.

15 C. Walén, Arkiv Mat. Astron. Fysik A30, No. 15, 2 (1944).

To gain further insight into the momentum balance, 
it is instructive to integrate both sides of (17) through- 
out a rigid volume fixed in the observer’s inertial frame 
of reference. Thus, making use of the (tensor) divergence 
theorem, we obtain 


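$$\frac{d}{dt}\int_V[\rho\mathbf{v}+\mu\epsilon(\mathbf{e}\times\mathbf{H})]\,d\tau=\int_S\mathbf{n}\cdot\mathsf{T}_t\,da-\int_S(\mathbf{n}\cdot\mathbf{v})\rho\mathbf{v}\,da-\int_S\mathbf{n}p\,da, \tag{18}$$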


which expresses in integral form the conservation of 
momentum for a fixed volume in a magneto-hydro- 
dynamic field. The volume integral on the left represents 
the time rate of change of the total mechanical plus 
electromagnetic momentum contained within the fixed 
volume and therefore must be equal to the total force 
acting on the matter and the electromagnetic field 
within the volume. This force is fully accounted for by 
the three surface integrals on the right of (18). The first 
surface integral denotes the force acting on the fixed 
volume which arises from the electromagnetic stresses 
across the bounding surface; the second surface integral 
accounts for the influx of matter carrying momentum 
across the walls of the fixed volume and may be inter- 
preted as the force resulting from the impact of the 
moving fluid on the bounding surface; and the third 
surface integral is merely the net force acting on the 
fixed volume by virtue of the normal pressure on the 
walls of the volume. 


2.4 Reduction to One Fundamental Equation 


In order to solve a given magneto-hydrodynamic 
problem we would like to eliminate from the system 
(I-VIII) all but one of the dependent vector variables, 
but this is impossible in general because of the com- 
plexity of the equations and because of nonlinearity. 
However, it proves possible to obtain quite generally a 
single vector partial differential equation in the fluid 
velocity v in which there remain only a number of 
unwanted second-order terms that one eventually 
ignores in a linearized theory. To this end we first take 
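the curl of (I) to obtain, making use of (II),

$$[\nabla^{2}-\mu\epsilon(\partial^{2}/\partial t^{2})]\mathbf{e}=\mu(\partial\mathbf{j}/\partial t)+\nabla\nabla\cdot\mathbf{e}, \tag{19}$$


from which, making use of (V) to eliminate $\nabla\cdot\mathbf{e}$ and
multiplying vectorially into the constant vector $\mathbf{B}_0$ both
sides of the resulting equation, we obtain


$$[\sigma+\epsilon(\partial/\partial t)][\nabla^{2}-\mu\epsilon(\partial^{2}/\partial t^{2})](\mathbf{e}\times\mathbf{B}_0)+\sigma[\nabla\nabla\cdot(\mathbf{v}\times\mathbf{B})]\times\mathbf{B}_0=\mu(\partial/\partial t)[\sigma+\epsilon(\partial/\partial t)](\mathbf{j}\times\mathbf{B}_0)-[\nabla\nabla\cdot(\eta\mathbf{v})]\times\mathbf{B}_0. \tag{20}$$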


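Next, to proceed with the elimination, we rewrite the
Eulerian equation (VII) in the form


$$\mathbf{F}=\mathbf{f},\qquad \mathbf{F}=\rho(d\mathbf{v}/dt)+\nabla p, \tag{21}$$


from which, introducing the Maxwell's electromagnetic
stress tensor for the induced field,


$$\mathsf{T}=\epsilon[\mathbf{e}\mathbf{e}-\tfrac{1}{2}e^{2}\mathcal{I}]+\mu[\mathbf{h}\mathbf{h}-\tfrac{1}{2}h^{2}\mathcal{I}], \tag{22}$$

and its (tensor) divergence


$$\nabla\cdot\mathsf{T}=\epsilon[\mathbf{e}\nabla\cdot\mathbf{e}+(\nabla\times\mathbf{e})\times\mathbf{e}]+\mu[\mathbf{h}\nabla\cdot\mathbf{h}+(\nabla\times\mathbf{h})\times\mathbf{h}], \tag{23}$$


we obtain, making use of (3),

$$\mathbf{j}\times\mathbf{B}_0=\mathbf{F}-\nabla\cdot\mathsf{T}+\mu\epsilon(\partial/\partial t)(\mathbf{e}\times\mathbf{h}), \tag{24}$$

where the vector $\mathbf{F}$ denotes the hydrodynamic term


$$\mathbf{F}=\rho(d\mathbf{v}/dt)+\nabla p=\rho(\partial\mathbf{v}/\partial t)+\tfrac{1}{2}\rho\nabla v^{2}+\rho(\nabla\times\mathbf{v})\times\mathbf{v}+\nabla p. \tag{25}$$


Similarly, making use of (V) and (24), we obtain


$$\mathbf{e}\times\mathbf{B}_0=\sigma^{-1}[\mathbf{F}-\nabla\cdot\mathsf{T}+\mu\epsilon(\partial/\partial t)(\mathbf{e}\times\mathbf{h})]-(\eta/\sigma)(\mathbf{v}\times\mathbf{B}_0)-(\mathbf{v}\times\mathbf{B})\times\mathbf{B}_0, \tag{26}$$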


with which we have completed the elimination of the
electromagnetic field vectors, except for second-order
terms, from the vectors $\mathbf{j}\times\mathbf{B}_0$ and $\mathbf{e}\times\mathbf{B}_0$ which still
remain in (20).

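Thus, finally, substituting into (20) the expressions
(24) and (26), we obtain the complete and exact
magneto-hydrodynamic equation in the fluid velocity,
namely


$$\begin{aligned}
&[\sigma+\epsilon(\partial/\partial t)][\nabla^{2}-\mu\epsilon(\partial^{2}/\partial t^{2})]\{\sigma^{-1}[\mathbf{F}-\nabla\cdot\mathsf{T}+\mu\epsilon(\partial/\partial t)(\mathbf{e}\times\mathbf{h})]\\
&\qquad-(\eta/\sigma)(\mathbf{v}\times\mathbf{B}_0)-(\mathbf{v}\times\mathbf{B})\times\mathbf{B}_0\}+\sigma[\nabla\nabla\cdot(\mathbf{v}\times\mathbf{B})]\times\mathbf{B}_0\\
&\quad=\mu(\partial/\partial t)[\sigma+\epsilon(\partial/\partial t)][\mathbf{F}-\nabla\cdot\mathsf{T}+\mu\epsilon(\partial/\partial t)(\mathbf{e}\times\mathbf{h})]-[\nabla\nabla\cdot(\eta\mathbf{v})]\times\mathbf{B}_0,
\end{aligned} \tag{27}$$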


the notable feature of which is the fact that all the
troublesome terms which render its solution in the
present form completely intractable appear only as
quadratic terms. Therefore, it is suggested that in a
linearized theory we merely drop the unwanted second-order
terms. However, to justify this procedure more
fully, we consider next the special case of infinite conductivity
and zero displacement current which serves as
a guide post.


2.5 Infinite Conductivity and Zero Displacement 
Current 


It has been claimed by Walén¹⁶ that, in the limit of
infinite conductivity and zero displacement current, the
magneto-hydrodynamic equations for an incompressible
fluid become linear. We propose to show that Walén's
statement is true only in the following restricted sense:
that in this limiting case there exists a class of solutions
(magneto-hydrodynamic waves) of the linearized equations
which also satisfy the nonlinear system except for
an uninteresting quadratic term in the fluid velocity. To
this end, let us examine the limiting form of (27) as
$\sigma\to\infty$ and $\epsilon\to0$. In this limit, we have merely


$$\{\nabla\times[\nabla\times(\mathbf{v}\times\mathbf{B})]\}\times\mathbf{B}_0=\mu(\partial/\partial t)[\rho(\partial\mathbf{v}/\partial t)+\tfrac{1}{2}\rho\nabla v^{2}+\rho(\nabla\times\mathbf{v})\times\mathbf{v}+\nabla p-\mu(\nabla\times\mathbf{h})\times\mathbf{h}], \tag{28}$$


where $\mathbf{B}=\mathbf{B}_0+\mu\mathbf{h}$ and in which quadratic terms are still
very much in evidence. Dropping these terms outright,
however, we obtain the much simpler linear equation


$$[\nabla\times\nabla\times(\mathbf{v}\times\mathbf{B}_0)]\times\mathbf{B}_0=\mu\rho(\partial^{2}\mathbf{v}/\partial t^{2})+\mu\nabla(\partial p/\partial t), \tag{29}$$


which can be solved exactly.
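Thus, putting $\mathbf{B}_0=\hat{\mathbf{e}}_zB_0$ and assuming $\nabla\cdot\mathbf{v}=0$, we
obtain for the left side of (29)


$$[\nabla\times\nabla\times(\mathbf{v}\times\mathbf{B}_0)]\times\mathbf{B}_0=B_0^{2}[\partial^{2}\mathbf{v}/\partial z^{2}-\nabla(\partial v_z/\partial z)], \tag{30}$$

and introducing the phase velocity

$$V_a=\pm B_0(\mu\rho)^{-1/2}, \tag{31}$$


which we will henceforth refer to as Alfvén's phase
velocity in honor of its discoverer,¹⁷ we rewrite (29) in
the form


$$(\partial^{2}\mathbf{v}/\partial z^{2})-V_a^{-2}(\partial^{2}\mathbf{v}/\partial t^{2})=\nabla[(\mu B_0^{-2})(\partial p/\partial t)+(\partial v_z/\partial z)]. \tag{32}$$

Taking the divergence of the vectors on both sides of
(32) and recalling that $\nabla\cdot\mathbf{v}=0$ has been assumed we
obtain


$$\nabla^{2}[(\mu B_0^{-2})(\partial p/\partial t)+(\partial v_z/\partial z)]=0, \tag{33}$$


i.e., the expression within the bracket must be a solution
of Laplace's equation everywhere. Therefore, we must
have


$$(\mu B_0^{-2})(\partial p/\partial t)+(\partial v_z/\partial z)=\text{constant}, \tag{34}$$

which allows the computation of the (excess) pressure in
terms of the $z$ component of velocity. Furthermore,
substituting (34) into (32) we obtain the one-dimensional
vector wave equation


$$(\partial^{2}\mathbf{v}/\partial z^{2})-V_a^{-2}(\partial^{2}\mathbf{v}/\partial t^{2})=0, \tag{35}$$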

16 Reference 3, Eqs. (9) and (10), p. 


17 H. Alfvén, Arkiv Mat. Astron. Fysik 29B, No. 2 (1942).


whose most general solution may be written as

$$\mathbf{v}(x,y,z,t)=\mathbf{v}_+(x,y)f(z-V_at)+\mathbf{v}_-(x,y)g(z+V_at), \tag{36}$$


where $f$ and $g$ are arbitrary, dimensionless, single-valued,
finite, continuous, and differentiable functions
of their respective arguments and where $\mathbf{v}_+$ and $\mathbf{v}_-$ are
arbitrary velocity amplitude vectors independent of $z$
and $t$.
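
That a superposition of the form (36) satisfies the wave equation (35) can be confirmed with a short finite-difference check. The sketch below is an illustration only: the pulse shape, the amplitudes, and the value of the phase velocity are hypothetical choices, and only one scalar component of the velocity is considered.

```python
import math

V_A = 7.7   # m/sec, hypothetical value of the Alfven phase velocity


def profile(arg):
    # Arbitrary smooth pulse; any twice-differentiable function would do.
    return math.exp(-arg * arg)


def velocity(z, t):
    """One component of Eq. (36) with f = g = profile, amplitudes 1 and 0.5."""
    return profile(z - V_A * t) + 0.5 * profile(z + V_A * t)


def wave_residual(z, t, h=1e-3):
    """Finite-difference estimate of d2v/dz2 - V_A**-2 * d2v/dt2, Eq. (35)."""
    d2z = (velocity(z + h, t) - 2 * velocity(z, t) + velocity(z - h, t)) / h**2
    dt = h / V_A   # time step matched to the space step along the characteristics
    d2t = (velocity(z, t + dt) - 2 * velocity(z, t) + velocity(z, t - dt)) / dt**2
    return d2z - d2t / V_A**2


print(wave_residual(0.4, 0.05))
```

Because the time step is matched to the space step, the two second differences sample the same values of the arguments of $f$ and $g$, and the residual is zero up to rounding error.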

We propose to return to the infinite conductivity case
in more detail in the sequel to this paper, Part II. Here,
we merely wish to use (36) to solve for the induced
magnetic field $\mathbf{h}$. Thus, using the linearized form of (V),
we obtain in the limit of infinite conductivity:


$$\mathbf{e}=-\mathbf{v}\times\mathbf{B}_0, \tag{37}$$


from which, making use of (II), we have

$$\mu(\partial\mathbf{h}/\partial t)=\nabla\times(\mathbf{v}\times\mathbf{B}_0)=B_0(\partial\mathbf{v}/\partial z),\qquad(\nabla\cdot\mathbf{v}=0). \tag{38}$$


Noting that, as a consequence of the form of the solution
(36), $(\partial\mathbf{v}/\partial t)=-V_a(\partial\mathbf{v}/\partial z)$, we obtain from (38) the
important result:

$$\mathbf{h}/H_0=-\mathbf{v}/V_a, \tag{39}$$


where $H_0=B_0/\mu$ and $V_a$ is given by (31), the choice of
sign depending on the direction of propagation. Equation
(39) may be rewritten, making use of (31), in the
symmetric form

$$(\mu)^{1/2}\mathbf{h}=\mp(\rho)^{1/2}\mathbf{v}, \tag{40}$$


which expresses the fact that, in this special case, the
vectors $\mathbf{v}$ and $\mathbf{h}$ are everywhere parallel or antiparallel
provided that the fluid is incompressible. Thus, finally,
making use of (40) in the original nonlinearized wave
equation (28), we obtain exactly


$$\{\nabla\times[\nabla\times(\mathbf{v}\times\mathbf{B}_0)]\}\times\mathbf{B}_0=\mu(\partial/\partial t)[\rho(\partial\mathbf{v}/\partial t)+\tfrac{1}{2}\rho\nabla v^{2}+\nabla p], \tag{41}$$


which differs from the linearized form (29) only in the
presence of the quadratic term $\tfrac{1}{2}\rho\nabla v^{2}$ in the bracket to
the right, hence proving our original contention.
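
For orientation, the magnitudes in Eqs. (31), (39), and (40) can be evaluated for a laboratory liquid metal. The sketch below is illustrative only: the applied induction, the fluid speed, and the density of mercury are assumed values, the permeability is taken as that of vacuum, and the function names are hypothetical.

```python
import math

MU_0 = 4e-7 * math.pi   # H/m, permeability of vacuum in rationalized mks units


def alfven_velocity(b0, density, mu=MU_0):
    """Magnitude of Eq. (31): V_a = B_0 / sqrt(mu * rho)."""
    return b0 / math.sqrt(mu * density)


def induced_intensity(v, b0, density, mu=MU_0):
    """Magnitude of h from Eq. (39): h = H_0 * v / V_a, with H_0 = B_0 / mu."""
    return (b0 / mu) * v / alfven_velocity(b0, density, mu)


b0 = 1.0           # weber/m^2, ASSUMED applied induction
rho_hg = 1.355e4   # kg/m^3, approximate density of mercury (assumed value)
v = 0.01           # m/sec, ASSUMED fluid speed

v_a = alfven_velocity(b0, rho_hg)
h = induced_intensity(v, b0, rho_hg)
print(f"V_a = {v_a:.2f} m/sec, h/H_0 = {v / v_a:.2e}")
```

With these assumed numbers the phase velocity is a few meters per second, so a fluid speed of centimeters per second gives an induced intensity that is a small fraction of the applied one.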


2.6 Linearized Form of the Fundamental Equations 


Although the exact magneto-hydrodynamic equation
(27) does not reduce strictly to linear form, even in the
special limiting case considered above, the particular
solution (39) does suggest an absolute criterion, independent
of the conductivity, for the applicability of
small amplitude linear theory. Thus, we need only
assume that the fluid velocity always remains small in
comparison with Alfvén's phase velocity, $v \ll V_a$, in
which case, by virtue of (39), the induced magnetic
intensity will always remain small in comparison with
the externally applied magnetic intensity, $h \ll H_0$. If this
is true, then all second-order terms appearing in (27) can
be safely neglected in an approximate linearized theory.
Thus, neglecting $\mu\mathbf{h}$ in comparison with $\mathbf{B}_0$, i.e., replacing
$\mathbf{B}$ by $\mathbf{B}_0$ wherever it appears in the fundamental equations
and dropping all second-order terms from (27), we
obtain the linearized form of the magneto-hydrodynamic
wave equation, namely,


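$$[\sigma+\epsilon(\partial/\partial t)]\,[\nabla^2-\mu\epsilon(\partial^2/\partial t^2)]\,[\sigma^{-1}\mathbf{F}-(\mathbf{v}\times\mathbf{B}_0)\times\mathbf{B}_0] = \mu[\sigma+\epsilon(\partial/\partial t)](\partial\mathbf{F}/\partial t) - \sigma[\nabla\nabla\cdot(\mathbf{v}\times\mathbf{B}_0)]\times\mathbf{B}_0, \tag{42}$$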


where F denotes here the linearized form of the hydro- 
dynamic term (25), which now reduces to 


$$\mathbf{F} = \rho(\partial\mathbf{v}/\partial t) + \nabla p. \tag{43}$$


The important feature of Eq. (42) is that it exhibits 
the fluid velocity v as the sole dependent variable. That 
is, we have successfully eliminated the electromagnetic 
field vectors in a linearized theory. This possibility ap- 
parently had been overlooked in the literature even for 
the special case in which the electric displacement cur- 
rent is altogether neglected. Once in possession of the 
fluid velocity for a given case, as determined from (42), 
the computation of the accompanying electromagnetic 
field vectors is readily effected by making use of the 
original (linearized) Maxwellian equations. 

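In the important theoretical case of infinite conductivity
we readily obtain from (42), letting $\sigma\to\infty$, the
much simpler equation

$$\{\nabla\times[\nabla\times(\mathbf{v}\times\mathbf{B}_0)] + \mu\epsilon(\partial^2/\partial t^2)(\mathbf{v}\times\mathbf{B}_0)\}\times\mathbf{B}_0 = \mu(\partial\mathbf{F}/\partial t), \tag{44}$$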


which should be compared with (29). As we have seen, 
considerable simplification ensues in (42) and (44) if we 
neglect altogether the electric displacement current, as 
commonly done by most writers on the subject, which 
we can do here by merely putting $\epsilon = 0$. However, we do
not propose to make this approximation now, for it 
obscures some of the essential features of the resulting 
wave phenomenon, although we do intend to examine in 
the end the limiting form of the general results as one 
neglects the electric displacement current. 

Depending on the exact nature of the hydrodynamic 
term F, which appears in Eqs. (42) and (44), we 
recognize two distinct classes of linearized problems: 


I. Incompressible Fluids 


For an ideal incompressible fluid the condition of
incompressibility demands that $d\rho/dt = 0$, whence the
equation of continuity (VIII) reduces to

$$\nabla\cdot\mathbf{v} = 0, \tag{45}$$


which means that the velocity field must be solenoidal.
In this case, therefore, we must seek solutions of the
magneto-hydrodynamic wave equations (42) or (44),
with $\mathbf{F}$ as in Eq. (43), subject to the divergence condition
(45). The pressure $p$ then remains in $\mathbf{F}$ and, therefore,
must be determined in the course of solving for $\mathbf{v}$
from (42) or (44). We find later, in Part II, that
magneto-hydrodynamic waves in an incompressible fluid can be
of two types: devoid of pressure fluctuations and
accompanied by a pressure wave.


II. Compressible Fluids (Magneto-Acoustics)


For an ideal compressible fluid, devoid of viscosity 
and expansive friction, we have in addition to the 
equation of continuity (VIII) an equation of state 
which yields the functional dependence between the 
pressure and the density. For example, for an ideal gas 
subject to adiabatic processes we have 


$$p/p_0 = (\rho/\rho_0)^{\gamma}, \tag{46}$$

where $\gamma$ is the ratio of specific heats and $p_0$ is the pressure
corresponding to the equilibrium density $\rho_0$. Quite
generally, however, if we have available an equation of
state between $p$ and $\rho$, then (linearizing)

$$\nabla p = (dp/d\rho)_0\,\nabla\rho = V_s^2\,\nabla\rho; \qquad V_s = (dp/d\rho)_0^{1/2}, \tag{47}$$

where $V_s$ is the velocity of sound in the medium. In
particular, if the adiabatic condition (46) holds, we
obtain

$$V_s = (\gamma p_0/\rho_0)^{1/2}. \tag{48}$$
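
As a check on the step from (46) and (47) to (48), the following Python sketch (not part of the original text) evaluates the closed form (48) and compares it with a finite-difference estimate of $(dp/d\rho)_0^{1/2}$ taken directly from the adiabatic law. The function names and the sample values for air at room conditions are hypothetical.

```python
import math


def adiabatic_pressure(rho, gamma, p0, rho0):
    """Adiabatic law of Eq. (46): p = p0 * (rho / rho0)**gamma."""
    return p0 * (rho / rho0) ** gamma


def sound_speed_closed_form(gamma, p0, rho0):
    """Eq. (48): V_s = sqrt(gamma * p0 / rho0)."""
    return math.sqrt(gamma * p0 / rho0)


def sound_speed_numeric(gamma, p0, rho0, rel_step=1.0e-6):
    """Eq. (47): V_s = sqrt(dp/drho) at rho0, using a central difference."""
    d = rho0 * rel_step
    slope = (adiabatic_pressure(rho0 + d, gamma, p0, rho0)
             - adiabatic_pressure(rho0 - d, gamma, p0, rho0)) / (2.0 * d)
    return math.sqrt(slope)


if __name__ == "__main__":
    gamma, p0, rho0 = 1.4, 101325.0, 1.225  # illustrative values for air (SI units)
    print("closed form:", sound_speed_closed_form(gamma, p0, rho0))
    print("numerical  :", sound_speed_numeric(gamma, p0, rho0))
```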


Inserting (47) into (43) and making use of the 
linearized form of the equation of continuity, 


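$$\partial\rho/\partial t + \rho_0(\nabla\cdot\mathbf{v}) = 0, \tag{49}$$

we can eliminate the pressure and the density, obtaining

$$\partial\mathbf{F}/\partial t = \rho_0(\partial^2\mathbf{v}/\partial t^2) - \rho_0 V_s^2\,\nabla\nabla\cdot\mathbf{v}. \tag{50}$$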


Therefore, we must now seek solutions of the
magneto-hydrodynamic wave equations (42) or (44) after
inserting for $\partial\mathbf{F}/\partial t$ the expression on the right of (50). It is
shown later, Part II, that magneto-hydrodynamic waves 
in a perfect compressible fluid can be of two types: 
devoid of pressure fluctuations (as in the case of 
incompressible fluids), and accompanied by a pressure 
wave (magneto-acoustic waves) of which there are two 
distinct modes. 


3. PLANE WAVES 


At the outset we take the constant externally applied
field of magnetic induction parallel to the z-axis, i.e.,
$\mathbf{B}_0 = \mathbf{e}_z B_0$; and we make the assumption that the pressure
and the Cartesian components of the field vectors
exhibit the common space-time dependence characterized
by the dimensionless factor

$$\psi(\mathbf{r},t) = \exp\{i(\mathbf{k}\cdot\mathbf{r} - \omega t)\}, \tag{51}$$

where $\omega$ is the fixed angular frequency of the time
harmonic oscillations and $\mathbf{k}$ is the vector propagation
constant, which in general turns out to be a complex
vector. The function $\psi(\mathbf{r},t)$ satisfies the three-dimensional
scalar Helmholtz equation

$$(\nabla^2 + k^2)\psi = 0, \tag{52}$$

where $k^2 = \mathbf{k}\cdot\mathbf{k}$ is in general complex and must be
determined, for a particular solution, from the
magneto-hydrodynamic wave equations (42) or (44).

Further, since the direction of the constant vector $\mathbf{B}_0$
constitutes an obvious axis of symmetry of the problem,
we postulate that the vector propagation constant $\mathbf{k}$,
which can assume an arbitrary direction with respect to
the externally applied magnetic field, can be written
quite generally as

$$\mathbf{k} = \mathbf{n}k = \mathbf{e}_x k_x + \mathbf{e}_z k_z, \tag{53}$$


where $\mathbf{n}$ is a unit vector in the direction of propagation
and $k_x$ and $k_z$ represent, respectively, the transverse and
longitudinal wave numbers. We find in Part II that
only in the case of infinite conductivity does the system
sustain plane homogeneous waves in which $\mathbf{n}$ is a real
unit vector and the wave number $k$ is also real. When
the conductivity is finite, the resulting plane waves are
still homogeneous, but now the wave number $k$ is
complex. In both cases, it is possible to set up plane
wave solutions which are nonhomogeneous, i.e., with
equiphase and equiamplitude planes no longer coinciding,
which means then that $\mathbf{n}$ is a complex unit vector, but
we have not found these solutions of practical interest.

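Introducing the substitutions $\nabla = i\mathbf{k}$ and $\partial/\partial t = -i\omega$,
which are a consequence of (51), we obtain, instead of
(43) and (45),

$$\mathbf{F} = -i\omega\rho\mathbf{v} + ip\mathbf{k}; \qquad \mathbf{k}\cdot\mathbf{v} = 0, \tag{54}$$

in the case of incompressible fluids, Class I problems;
and, instead of (50),

$$-i\omega\mathbf{F} = -\omega^2\rho_0\mathbf{v} + \rho_0V_s^2(\mathbf{k}\cdot\mathbf{v})\mathbf{k}, \tag{55}$$

where $V_s$ is defined by (47), in the case of compressible
fluids (magneto-acoustics), Class II problems.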

To deduce the linearized equation in the fluid velocity 
v that results from the elimination of the electromagnetic 
field vectors e, h, and j, we need only make similar 
substitutions in the magneto-hydrodynamic wave equa- 
tion (42), obtaining 


$$(\omega\epsilon+i\sigma)(k^2-\omega^2\mu\epsilon)\big[\sigma^{-1}\mathbf{F} - B_0^2(\mathbf{v}\times\mathbf{e}_z)\times\mathbf{e}_z\big] = i\omega\mu(\omega\epsilon+i\sigma)\mathbf{F} - i\sigma B_0^2[\mathbf{e}_z\cdot(\mathbf{k}\times\mathbf{v})](\mathbf{k}\times\mathbf{e}_z), \tag{56}$$

which is valid for finite conductivity. And, either from
(44) or else letting $\sigma\to\infty$ in (56), we obtain

$$B_0^2\{(k^2-\omega^2\mu\epsilon)(\mathbf{v}\times\mathbf{e}_z) - [\mathbf{e}_z\cdot(\mathbf{k}\times\mathbf{v})]\mathbf{k}\}\times\mathbf{e}_z = -i\omega\mu\mathbf{F}, \tag{57}$$

which applies in the case of infinite conductivity. In
both cases, the vector $\mathbf{F}$ is given by (54) or (55)
depending on the class of problems being discussed.

In either class of problems, it is clear from the
postulate of plane waves as given by (51) and (53) that
the elementary solutions of the vector equations (56) or
(57) must be of one or more of the following three
forms:

$$\mathbf{v}_1 = \mathbf{e}_y v_0\psi; \qquad \mathbf{v}_2 = \mathbf{n}\times\mathbf{v}_1 = (\mathbf{n}\times\mathbf{e}_y)v_0\psi; \qquad \mathbf{v}_3 = \mathbf{n}v_0\psi, \tag{58}$$

where $\psi$ is given by (51) and $v_0$ is an arbitrary velocity
amplitude which, according to the conditions imposed
by a linearized theory (Sec. 2.6), must be much smaller
than Alfvén's phase velocity (31), that is, $v_0 \ll V_a$. As
illustrated in Fig. 1, which is drawn for a real propagation
vector $\mathbf{k}$, the first two proposed solutions $\mathbf{v}_1$ and $\mathbf{v}_2$
are solenoidal, $\mathbf{k}\cdot\mathbf{v} = 0$, while the third one is irrotational,
$\mathbf{k}\times\mathbf{v}_3 = 0$. It is clear from (54) that only the first two
solutions $\mathbf{v}_1$ and $\mathbf{v}_2$ are admissible in the case of
incompressible fluids, whereas the solution $\mathbf{v}_3$ must necessarily
be present in the case of magneto-acoustics, at least
whenever pressure fluctuations accompany the wave
phenomenon. Finally, it is seen from (52) that all three
velocity vectors (58) are linearly independent solutions
of the three-dimensional vector Helmholtz equation,

$$(\nabla^2 + k^2)\mathbf{v} = 0. \tag{59}$$


The actual selection of a particular solution (58) and the 
determination of the corresponding wave number $k$,
from either (56) or (57), will be found in Part II, where 
we discuss the application of the present theory to 
incompressible and compressible fluids. 
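
The stated properties of the three elementary forms can be checked with a few lines of vector algebra. The Python sketch below is illustrative only and is not taken from the paper. It assumes a real propagation vector lying in the xz-plane, so that the first form is directed along the y-axis, and it uses the plane-wave substitution $\nabla = i\mathbf{k}$, under which the divergence and curl reduce to $\mathbf{k}\cdot\mathbf{v}$ and $\mathbf{k}\times\mathbf{v}$. The function name and the sample angle and wave number are hypothetical.

```python
import numpy as np


def elementary_amplitudes(n, v0=1.0):
    """Vector amplitudes of the three forms (58), with the common factor psi dropped.

    Assumes n is a real unit vector in the xz-plane, so that v1 lies along e_y.
    """
    e_y = np.array([0.0, 1.0, 0.0])
    v1 = v0 * e_y
    v2 = v0 * np.cross(n, e_y)
    v3 = v0 * n
    return v1, v2, v3


if __name__ == "__main__":
    theta = 0.3                       # angle between k and the applied field (radians)
    n = np.array([np.sin(theta), 0.0, np.cos(theta)])
    k = 2.0 * n                       # real propagation vector, |k| = 2
    v1, v2, v3 = elementary_amplitudes(n)

    print("k . v1 =", np.dot(k, v1))          # solenoidal: vanishes
    print("k . v2 =", np.dot(k, v2))          # solenoidal: vanishes
    print("k x v3 =", np.cross(k, v3))        # irrotational: vanishes
    print("det    =", np.linalg.det(np.vstack([v1, v2, v3])))  # nonzero: independent
```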

Assuming that an appropriate particular solution of 
the vector equations (56) or (57) has been selected from 
(58), we can proceed quite generally from the linearized 
form of (1) to the computation of the electromagnetic 
field vectors e, h, and j in terms of the known velocity v. 
And, at every stage of the analysis, it proves extremely 
useful to examine the limiting form of the results as the 
conductivity becomes infinite and as we neglect the 
electric displacement current. 

Thus, making use of the linearized form of the 
Maxwellian set (I-VI) we obtain, by successive elimina- 
tions, the field vectors 


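$$\begin{aligned}
\mathbf{h} &= \frac{i\sigma\,\mathbf{k}\times(\mathbf{v}\times\mathbf{B}_0)}{k^2-\omega^2\mu\epsilon-i\omega\mu\sigma}
\;\xrightarrow[\sigma\to\infty]{}\; -\frac{\mathbf{k}\times(\mathbf{v}\times\mathbf{B}_0)}{\omega\mu} = \frac{\mathbf{k}\times\mathbf{e}}{\omega\mu},\\[4pt]
\mathbf{e} &= \frac{i\sigma\{(\omega^2\mu\epsilon+i\omega\mu\sigma)(\mathbf{v}\times\mathbf{B}_0) - [\mathbf{k}\cdot(\mathbf{v}\times\mathbf{B}_0)]\mathbf{k}\}}{(\omega\epsilon+i\sigma)(k^2-\omega^2\mu\epsilon-i\omega\mu\sigma)}
\;\xrightarrow[\sigma\to\infty]{}\; -\mathbf{v}\times\mathbf{B}_0,\\[4pt]
\mathbf{j} &= \frac{\sigma(\omega\epsilon+i\sigma)(k^2-\omega^2\mu\epsilon)(\mathbf{v}\times\mathbf{B}_0) - i\sigma^2[\mathbf{k}\cdot(\mathbf{v}\times\mathbf{B}_0)]\mathbf{k}}{(\omega\epsilon+i\sigma)(k^2-\omega^2\mu\epsilon-i\omega\mu\sigma)}\\
&\qquad\xrightarrow[\sigma\to\infty]{}\; (i/\omega\mu)\{(k^2-\omega^2\mu\epsilon)(\mathbf{v}\times\mathbf{B}_0) - [\mathbf{k}\cdot(\mathbf{v}\times\mathbf{B}_0)]\mathbf{k}\}.
\end{aligned} \tag{60}$$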


And, in case we neglect the electric displacement cur- 
rent, we can obtain from (60) the corresponding limiting 
forms by merely putting $\epsilon = 0$. We then find that the
forms of e and h for the case of infinite conductivity 
remain unaltered whether we retain or neglect the 
electric displacement current, but that such is not the 
case for the current density j. 
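
The field expressions (60) can be evaluated numerically and checked against the relations from which they follow. The Python sketch below is an illustration, not the author's procedure. It assumes that the linearized Maxwellian set reduces, for plane waves, to $\mathbf{h} = \mathbf{k}\times\mathbf{e}/\omega\mu$, $i\mathbf{k}\times\mathbf{h} = \mathbf{j} - i\omega\epsilon\mathbf{e}$, and $\mathbf{j} = \sigma(\mathbf{e} + \mathbf{v}\times\mathbf{B}_0)$, which is the form consistent with (37) and (60). The function name and all numerical values are hypothetical.

```python
import numpy as np


def plane_wave_fields(v, k, b0, omega, sigma, mu, eps):
    """Return (e, h, j) for one plane-wave component, following Eq. (60)."""
    vxb = np.cross(v, b0)
    k2 = np.dot(k, k)                 # k.k without conjugation (k may be complex)
    a = omega * eps + 1j * sigma
    denom = k2 - omega**2 * mu * eps - 1j * omega * mu * sigma
    e = 1j * sigma * ((omega**2 * mu * eps + 1j * omega * mu * sigma) * vxb
                      - np.dot(k, vxb) * k) / (a * denom)
    h = 1j * sigma * np.cross(k, vxb) / denom
    j = sigma * (a * (k2 - omega**2 * mu * eps) * vxb
                 - 1j * sigma * np.dot(k, vxb) * k) / (a * denom)
    return e, h, j


if __name__ == "__main__":
    mu, eps = 4.0e-7 * np.pi, 8.854e-12          # free-space values, SI units
    v = np.array([0.1, 0.2, 0.05])               # velocity amplitude
    k = np.array([3.0, 0.0, 4.0])                # propagation vector
    b0 = np.array([0.0, 0.0, 0.5])               # applied induction along z
    omega = 1.0e3

    e, h, j = plane_wave_fields(v, k, b0, omega, 50.0, mu, eps)
    print("Ohm's law residual:", np.abs(j - 50.0 * (e + np.cross(v, b0))).max())

    e_inf, _, _ = plane_wave_fields(v, k, b0, omega, 1.0e12, mu, eps)
    print("e + v x B0 at large sigma:", e_inf + np.cross(v, b0))
```

For a very large conductivity the electric field returned by the function approaches $-\mathbf{v}\times\mathbf{B}_0$, as stated in the limiting form of (60).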


4. CYLINDRICAL WAVES


FIG. 1. Elementary plane wave solutions of the vector equation
in the particle velocity.


We postulate again that the externally applied field of
magnetic induction is parallel to the z-axis, $\mathbf{B}_0 = \mathbf{e}_z B_0$,
and we then assume that the pressure and all the field
vectors can be deduced from the scalar function of
position and time

$$\psi(\mathbf{r},t) = A\,\phi(\boldsymbol{\rho})\,e^{i(kz-\omega t)}, \tag{61}$$


where $\mathbf{r} = \boldsymbol{\rho} + \mathbf{e}_z z$ is the position vector of the point of
observation and $A$ is a constant of dimensions meter²
sec⁻¹. The dimensionless factor $\phi(\boldsymbol{\rho})$, which is a function
of the transverse coordinates only, is assumed to satisfy
the two-dimensional scalar Helmholtz equation

$$(\nabla_t^2 + \gamma^2)\phi = 0, \tag{62}$$

where $\nabla_t^2$ is the transverse part of the Laplacian operator,
$\nabla^2 = \nabla_t^2 + (\partial/\partial z)^2$, and $\gamma$ is the transverse wave
number. As a consequence of (62) the space-time function
(61) satisfies the three-dimensional scalar Helmholtz
equation:

$$(\nabla^2 + K^2)\psi = 0, \qquad K^2 = \gamma^2 + k^2, \tag{63}$$

where $k$ is the longitudinal wave number. In general, $\gamma$
is chosen as a positive definite quantity whose actual
value, for a given mode of propagation, is dictated by
boundary conditions on a cylindrical coordinate surface
with generators parallel to the applied field, whereas $k$
turns out to be in general complex.

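Making the substitution $\partial/\partial t = -i\omega$, in accordance
with (61), we have, instead of (43) and (45),

$$\mathbf{F} = -i\omega\rho\mathbf{v} + \nabla p; \qquad \nabla\cdot\mathbf{v} = 0, \tag{64}$$

in the case of incompressible fluids; and, instead of (50),

$$i\omega\mathbf{F} = \omega^2\rho_0\mathbf{v} + \rho_0V_s^2\,\nabla\nabla\cdot\mathbf{v}, \tag{65}$$

in the case of magneto-acoustics.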

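Next, resolving every vector into its transverse
and longitudinal components, i.e., $\mathbf{v} = \mathbf{v}_t + \mathbf{e}_z v_z$,
$\nabla = \nabla_t + \mathbf{e}_z(\partial/\partial z)$, we deduce from (42) the form of the
magneto-hydrodynamic wave equation which applies to
cylindrical waves in the case of finite conductivity,
namely

$$(\omega\epsilon+i\sigma)(\nabla^2+\omega^2\mu\epsilon)\,[\sigma^{-1}\mathbf{F} + B_0^2\mathbf{v}_t] = -i\omega\mu(\omega\epsilon+i\sigma)\mathbf{F} + i\sigma B_0^2[\nabla_t^2\mathbf{v}_t - \nabla_t\nabla_t\cdot\mathbf{v}_t], \tag{66}$$

and proceeding similarly with (44) we obtain the corresponding
wave equation which applies in the case of
infinite conductivity,

$$B_0^2[(\partial^2/\partial z^2+\omega^2\mu\epsilon)\mathbf{v}_t + \nabla_t\nabla_t\cdot\mathbf{v}_t] = -i\omega\mu\mathbf{F}. \tag{67}$$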


In both cases the vector F assumes the form (64) for 
incompressible fluids or the form (65) for magneto- 
acoustics. 

To obtain the velocity field corresponding to a particular
situation we note, in complete analogy with the
above discussion for plane waves, that the elementary
solutions of the magneto-hydrodynamic wave equations
(66) or (67) must be of one or more of the following three
forms¹⁸:

$$\begin{aligned}
\mathbf{v}_1 &= \nabla\times(\mathbf{e}_z\psi) = \nabla\psi\times\mathbf{e}_z,\\
\mathbf{v}_2 &= (ik)^{-1}\nabla\times\mathbf{v}_1 = (ik)^{-1}\nabla\times\nabla\times(\mathbf{e}_z\psi),\\
\mathbf{v}_3 &= \nabla\psi,
\end{aligned} \tag{68}$$

where $\psi$ is defined by (61) and $k$ is the longitudinal
wave number. The first two proposed solutions $\mathbf{v}_1$ and $\mathbf{v}_2$
are solenoidal, $\nabla\cdot\mathbf{v} = 0$, whereas the third one is
irrotational, $\nabla\times\mathbf{v}_3 = 0$. As in the case of plane waves, it is
clear from (64) that only the solenoidal solutions $\mathbf{v}_1$ and
$\mathbf{v}_2$ are admissible in the case of incompressible fluids,
whereas the irrotational solution $\mathbf{v}_3$ must of necessity
appear in the case of magneto-acoustics, at least when
the wave phenomenon is accompanied by pressure
fluctuations. Finally, it is seen from (63) that all three
velocity vectors (68) are linearly independent solutions
of the vector Helmholtz equation:

$$(\nabla^2 + K^2)\mathbf{v} = 0, \qquad K^2 = \gamma^2 + k^2, \tag{69}$$

where $\gamma$ and $k$ are the transverse and longitudinal wave
numbers, respectively. As a consequence of (69) it is
readily shown from the linearized form of the Maxwellian
set (I-VI) that the electromagnetic field vectors $\mathbf{e}$, $\mathbf{h}$,
and $\mathbf{j}$ are themselves solutions of the vector Helmholtz
equation with $K^2$ as in (69).

18 The actual proof that the velocity vectors (68), or linear
combinations thereof, constitute elementary solutions of the
magneto-hydrodynamic wave equations (66) or (67) will be given
in Part II.

The actual selection from (68) of a particular solution
or a linear combination thereof for a given case, and the
computation of the longitudinal wave number $k$ in
terms of a preassigned transverse wave number $\gamma$ are
effected in the course of solving the magneto-hydrodynamic
wave equations (66) or (67). The details of
these computations, as they apply to incompressible and
compressible fluids, will be found in the sequel to this
paper, Part II.

Once in possession of an appropriate particular solution
of the magneto-hydrodynamic wave equations (66)
or (67), we can proceed quite generally from the
linearized form of the Maxwellian set (I-VI), as in the
case of plane waves, to the computation of the electromagnetic
field vectors $\mathbf{e}$, $\mathbf{h}$, and $\mathbf{j}$. Making use of the fact
that these vectors satisfy the vector Helmholtz equation
(69), we obtain by successive eliminations:

$$\begin{aligned}
\mathbf{h} &= \frac{\sigma\,\nabla\times(\mathbf{v}\times\mathbf{B}_0)}{K^2-\omega^2\mu\epsilon-i\omega\mu\sigma}
\;\xrightarrow[\sigma\to\infty]{}\; \frac{i}{\omega\mu}\nabla\times(\mathbf{v}\times\mathbf{B}_0) = -\frac{i}{\omega\mu}\nabla\times\mathbf{e},\\[4pt]
\mathbf{e} &= \frac{i\sigma[(\omega^2\mu\epsilon+i\omega\mu\sigma)(\mathbf{v}\times\mathbf{B}_0) + \nabla\nabla\cdot(\mathbf{v}\times\mathbf{B}_0)]}{(\omega\epsilon+i\sigma)(K^2-\omega^2\mu\epsilon-i\omega\mu\sigma)}
\;\xrightarrow[\sigma\to\infty]{}\; -\mathbf{v}\times\mathbf{B}_0,\\[4pt]
\mathbf{j} &= \frac{\sigma(\omega\epsilon+i\sigma)(K^2-\omega^2\mu\epsilon)(\mathbf{v}\times\mathbf{B}_0) + i\sigma^2\,\nabla\nabla\cdot(\mathbf{v}\times\mathbf{B}_0)}{(\omega\epsilon+i\sigma)(K^2-\omega^2\mu\epsilon-i\omega\mu\sigma)}\\
&\qquad\xrightarrow[\sigma\to\infty]{}\; (i/\omega\mu)[(K^2-\omega^2\mu\epsilon)(\mathbf{v}\times\mathbf{B}_0) + \nabla\nabla\cdot(\mathbf{v}\times\mathbf{B}_0)],
\end{aligned} \tag{70}$$

which are seen to agree with (60), corresponding to
plane waves, if we merely replace $\nabla$ by $i\mathbf{k}$ and write $k^2$
instead of $K^2$.

In case we neglect the electric displacement current,
we need merely put $\epsilon = 0$ in (70) to obtain the corresponding
limiting forms. It is worth pointing out
once more that, in the case of infinite conductivity, the
forms for $\mathbf{e}$ and $\mathbf{h}$ remain unaltered whether we retain or
neglect the electric displacement current, but that such
is not the case for the current density $\mathbf{j}$.

In conclusion, the author wishes to express his sincere
appreciation to Professors David S. Saxon and Leon
Knopoff, of the Physics Department and Institute of
Geophysics, respectively, for many illuminating discussions
that proved extremely fruitful in the course of
these studies.


Particle Transport, Electric Currents, and Pressure Balance
in a Magnetically Immobilized Plasma


Lewi Tonks*
Knolls Atomic Power Laboratory, General Electric Company, Schenectady, New York 


(Received December 13, 1954) 


An analysis of a plasma immobilized by a magnetic field shows that each kind of charged particle has a 
general drift perpendicular to the gradient of the field, but that there is no corresponding electric current 
density. There is an exact cancellation arising from the gradient of the Larmor radius. The current density 
which is present arises exclusively from the particle density gradient and is not associated with any drift of 


matter. 


In an infinite, completely ionized plasma in a magnetic
field where the pertinent variables, density and
magnetic field, are each a function of one coordinate
only which lies perpendicular to the field, it is generally
accepted that

$$B_1^2 - B_2^2 = 8\pi(n_2 - n_1)kT, \tag{1}$$

where $B_1$ and $B_2$ are the field strengths at points where
the total particle concentrations are $n_1$ and $n_2$, respectively,
and $T$ is the absolute temperature of the plasma.
This equation is usually derived by applying
magneto-hydrodynamic principles through the equations:

$$\nabla p = \mathbf{j}\times\mathbf{B}, \qquad 4\pi\mathbf{j} = \nabla\times\mathbf{B},$$
$$4\pi\nabla p = (\nabla\times\mathbf{B})\times\mathbf{B} = (\mathbf{B}\cdot\nabla)\mathbf{B} - \tfrac{1}{2}\nabla B^2,$$
$$(\mathbf{B}\cdot\nabla)\mathbf{B} = 0 \text{ under symmetry assumed},$$
$$p = nkT.$$
Equation (1) should also be derivable from analysis
based on the microscopic structure of such a plasma,
and it is worth while to do this because the derivation
brings to light a peculiar and possibly important
property of the plasma. An analysis along somewhat
similar lines but using large volume elements instead of
small ones has already been given by Spitzer,¹ but added
insight is gained by the present method.

* Knolls Atomic Power Laboratory, operated by the General
Electric Company for the U. S. Atomic Energy Commission.

We adopt a local right-handed Cartesian coordinate
system with x- and y-axes in the plane of the paper and
the magnetic field normal to it. We orient the system so
that at the origin, which is the point of interest, the
magnetic field, $B$, and plasma density, $n$, are functions
of $x$ only, and the space variations of $n$ and $B$ are assumed
to be small in the span of the average orbit
diameter. We consider a volume element $dx\,dy$ (being
unity along $z$) at the origin. This is illustrated in Fig. 1.

The current density at the origin will then be

$$\mathbf{j} = e\Big(\sum_p \mathbf{v}_p - \sum_e \mathbf{v}_e\Big)\Big/dx\,dy, \tag{2}$$


where the summations are over all positive ions and all
electrons in the element $dx\,dy$ at any instant. It will
suffice (a) to make the detailed analysis for one kind of
particle, and we choose ions, (b) to omit z-components
of velocity, and (c) to neglect collisions, which lead to
diffusion effects which may be superimposed on those
we are examining. The xy-projection of the motion of a
typical ion is described by

$$x = -a(x_0)\cos[\omega(x_0)t - \theta] + x_0, \tag{3}$$
$$y = a(x_0)\sin[\omega(x_0)t - \theta] + v_D(x_0)(t - t_0), \tag{4}$$

where $a = cmv_\perp/(eB)$ is the orbit radius, with $c$ the
velocity of light, $m$ the particle mass, $v_\perp$ the speed of its
xy-projection, $e$ its charge in esu, and $B$ the field strength
in emu at $x_0$; $\omega = eB/(mc)$ is the Larmor angular
velocity; $t$ is time in seconds; $\theta$ is a random phase angle
over which an average will be taken; $x_0$ and $v_D(t - t_0) = y_0$
are coordinates of the guiding center;

$$v_D = \frac{cmv_\perp^2\,dB/dx}{2eB^2} = \frac{a^2\omega B'}{2B} \tag{5}$$

is the drift velocity of the guiding center² due to the
field gradient, and $t_0$ is a random epoch; in the approximation
used in Eq. (4) the quantities $a$ and $\omega$ must be
regarded as constants over the path characterized by
the value of $B$ at the guiding center. Then

$$\dot{x} = a\omega\sin(\omega t - \theta), \tag{6}$$
$$\dot{y} = a\omega\cos(\omega t - \theta) + v_D. \tag{7}$$

1 L. Spitzer, Jr., Astrophys. J. 116, 299 (1952).

FIG. 1. Ion paths, guiding centers, and volume elements in
magnetically immobilized plasma.

We note that one instant of time is as good as another
so that we may as well choose $t = 0$, which we shall do.
Thus Eqs. (3), (4), (6), and (7) become:

$$x = -a\cos\theta + x_0, \qquad \dot{x} = -a\omega(x_0)\sin\theta, \tag{8A,B}$$
$$y = -a\sin\theta - v_D t_0, \qquad \dot{y} = a\omega(x_0)\cos\theta + v_D. \tag{9A,B}$$

The guide centers which contribute ions to $dx\,dy$ lie
within a volume $dx_0\,dy_0$ found from Eqs. (3) and (4) by
means of the Jacobian:

$$\partial x_0/\partial x = [1 - (aB'/B)\cos\theta]; \qquad \partial x_0/\partial y = 0;$$
$$\partial y_0/\partial x = \partial(-v_D t_0)/\partial x = -(aB'/B)\sin\theta; \qquad \partial y_0/\partial y = 1,$$

[since $\partial a/\partial x = (\partial/\partial x)(cmv_\perp/eB) = -(a/B)\,\partial B/\partial x$], so
that

$$\frac{(dx_0\,dy_0)_\theta}{(dx\,dy)_\theta} = 1 - (aB'/B)\cos\theta \tag{10}$$

to our present approximation. The subscript $\theta$ indicates
that this relation holds for a particular $\theta$.

2 H. Alfvén, Cosmical Electrodynamics (Oxford University Press,
London, 1950), paragraph 2.33, Eqs. (22) and (23).

We can look upon the positive ion component $\mathbf{j}_p$ of $\mathbf{j}$
in Eq. (2) as a resultant of currents $\mathbf{j}_{p,\theta}$ contributed by
different phases of the ion motions, so that

$$\int \mathbf{j}_{p,\theta}\,d\theta = \mathbf{j}_p,$$

$$\mathbf{j}_{p,\theta} = \frac{e\sum_\theta \mathbf{v}_p}{(dx\,dy)_\theta} = \frac{e\,\mathbf{v}_p(\theta)\,n_{gc,\theta}\,(dx_0\,dy_0)_\theta}{(dx\,dy)_\theta}, \tag{11}$$

where $\mathbf{v}_p(\theta)$ is given by Eqs. (8B) and (9B) and $n_{gc,\theta}$
is the phase density of the ion guide centers at $x_0$, $y_0$.
We shall, for the time being, consider that $\mathbf{j}_p$, $n_p$, and
$n_{gc}$ apply to ions having $v_\perp$ as their projected speed.
Now the concentration of ions at a point is equal to the
concentration of guiding centers there to a second-order
approximation.

[A procedure similar to that already embodied in
Eqs. (2) and (11) together with a Taylor expansion of
$n_{gc,\theta}$ gives that result as follows:

$$\frac{\sum_\theta (1)}{dx\,dy} = n_{gc,\theta}(x_0,y_0)\,\frac{(dx_0\,dy_0)_\theta}{dx\,dy},$$
$$n_{gc,\theta}(x_0,y_0) = n_{gc,\theta}(0,0) + x_0\,\partial n_{gc}/\partial x + \tfrac{1}{2}x_0^2\,\partial^2 n_{gc}/\partial x^2 + \cdots,$$

and by using Eq. (8A) (with $x = 0$) and Eq. (10) and
integrating over $\theta$ one obtains

$$n_p = n_{gc} - \frac{a^2}{2B}\frac{\partial B}{\partial x}\frac{\partial n_{gc}}{\partial x} + \frac{a^2}{4}\frac{\partial^2 n_{gc}}{\partial x^2}. \tag{12}$$

Here $dB/dx$ is not independent of $dn_{gc}/dx$ because of
Eq. (1).]
Accordingly, to the present first-order approximation
we can write

$$n_{gc,\theta}(x_0,y_0) = n_{p,\theta}(x_0,y_0)$$

and

$$n_{p,\theta}(x_0,y_0) = n_p(x_0,y_0)/2\pi, \tag{13}$$

since the ions and guiding centers are uniformly distributed
in phase. Now

$$n_p(x_0,y_0) = n_p(0,0) + a\cos\theta\,\frac{\partial n_p}{\partial x} = n_p\Big(1 + \frac{a}{n_p}\frac{\partial n_p}{\partial x}\cos\theta\Big). \tag{14}$$


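In Eq. (11) we now combine Eqs. (8B) and (9B) for
components of $\mathbf{v}_p$, Eq. (10) for the relative size of
volume elements, and Eq. (14) for the concentration of
guiding centers, all to give

$$\mathbf{j}_{p,\theta}\,d\theta = \frac{e}{2\pi}\Big(1 + \frac{an_p'\cos\theta}{n_p}\Big)\Big(1 - \frac{aB'}{B}\cos\theta\Big)\,[-a\omega\sin\theta\,\mathbf{i}_1 + (a\omega\cos\theta + v_D)\,\mathbf{j}_1]\,n_p\,d\theta, \tag{15}$$

where $\mathbf{i}_1$ and $\mathbf{j}_1$ are the usual unit vectors. Now, when
we integrate over all phases from 0 to $2\pi$, the x-component
vanishes and, noting Eq. (5), we have

$$\mathbf{j}_p = \frac{ea^2\omega n_p}{2}\Big[\frac{B'}{B} + \Big(\frac{n_p'}{n_p} - \frac{B'}{B}\Big)\Big]\mathbf{j}_1. \tag{16}$$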

Here the first term in the brackets arises from $v_D$, the
translation of the ions in a space-varying magnetic field.
The two parenthetical terms arise from the circular
motion about the guiding center, and of the two, the
first depends, of course, simply on the density gradient
while the second, which is the more subtle of the two,
arises from the change in size of volume element caused
by the change in orbital radius with the magnetic field.
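
The cancellation described here can be reproduced by carrying out the phase average numerically. The Python sketch below is an illustration rather than part of the original derivation. It assumes the integrand of Eq. (15) in the form built from Eqs. (5), (8B), (9B), (10), and (14), takes the ion charge as unity, and uses hypothetical sample values chosen so that the orbit radius is small compared with the gradient lengths.

```python
import numpy as np


def ion_current(a, omega, n_p, dn_dx, B, dB_dx, e=1.0, n_theta=4096):
    """Phase-average the integrand of Eq. (15); returns (j_x, j_y)."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    v_d = a**2 * omega * dB_dx / (2.0 * B)               # drift velocity, Eq. (5)
    density = 1.0 + (a * dn_dx / n_p) * np.cos(theta)    # guiding-center density, Eq. (14)
    volume = 1.0 - (a * dB_dx / B) * np.cos(theta)       # volume-element ratio, Eq. (10)
    vx = -a * omega * np.sin(theta)                      # Eq. (8B)
    vy = a * omega * np.cos(theta) + v_d                 # Eq. (9B)
    weight = e * n_p * density * volume
    # The mean over uniformly spaced phases equals (1 / 2 pi) times the integral.
    return np.mean(weight * vx), np.mean(weight * vy)


if __name__ == "__main__":
    a, omega, n_p, dn_dx, B, dB_dx = 1.0e-3, 1.0e3, 1.0, 1.0, 1.0, 0.5
    jx, jy = ion_current(a, omega, n_p, dn_dx, B, dB_dx)
    print("j_x:", jx)
    print("j_y:", jy, " density-gradient term alone:", 0.5 * a**2 * omega * dn_dx)
```

To first order in the gradients the result depends only on the density gradient; changing the sign or size of `dB_dx` alters `j_y` only at second order.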
It appears, then, that the mass-motional contribution
from $v_D$ is just cancelled by a quasi-static contribution
to leave

$$\mathbf{j}_p = \frac{ea^2\omega}{2}\,n_p'\,\mathbf{j}_1 = \frac{cmv_\perp^2}{2B}\frac{dn_p}{dx}\,\mathbf{j}_1. \tag{17}$$

Spitzer has noted this cancellation without, however, 
calling attention to its almost paradoxical nature. The 
existence of steady mass motion is in itself consistent 
with the circumstance that we are here dealing with a 
steady state and not an equilibrium condition. In this 
the magnetic case is in contrast to the electrical analog 
even though the two have a formally close resemblance. 
For a plasma in equilibrium with no magnetic field but 
in an electric field we have, by the Boltzmann relation 


$$dn_p/dx = n_p eE/(kT), \qquad dn_e/dx = -n_e eE/(kT). \tag{18}$$

Poisson's equation gives

$$dE/dx = 4\pi e(n_p - n_e).$$

Adding the first two and eliminating $(n_p - n_e)$ with the
third leads to

$$p - p_0 = [n_p + n_e - (n_{p0} + n_{e0})]kT = (E^2 - E_0^2)/(8\pi). \tag{19}$$


This does describe an equilibrium state, and it is derived 
from equilibrium relations (Boltzmann equations). Al- 
though the language of equilibrium—“the magnetic 
pressure must balance the material pressure”—is used, 
still the magnetic analog in its hydrodynamic deriva- 
tion requires the use of a current relation, which in itself 
implies nonequilibrium. The illusion of equilibrium 
springs from the attainment of an immobile state, and 
this state is one in which there is mass transport of 
ions and electrons in opposite directions without a 
corresponding transport of charge. This transport is 
proportional to grad B; there is also a net motion of 
charge, proportional to grad $n$, without mass transport
of matter. 

If, now, we apply Eq. (17) to an essentially Maxwellian
distribution of velocities, $\tfrac{1}{2}mv_\perp^2$ averages to $kT$ and
$n_p$ and $\mathbf{j}_p$ can now denote total ion concentration and
total current density. Thus

$$\mathbf{j}_p = \frac{ckT}{B}\frac{dn_p}{dx}\,\mathbf{j}_1.$$

Also, it is evident that

$$\mathbf{j}_e = \frac{ckT}{B}\frac{dn_e}{dx}\,\mathbf{j}_1,$$

so that using $n = n_e + n_p$, $\mathbf{j} = \mathbf{j}_p + \mathbf{j}_e$, it follows that

$$\mathbf{j} = \frac{ckT}{B}\frac{dn}{dx}\,\mathbf{j}_1 = -\frac{c}{4\pi}\frac{dB}{dx}\,\mathbf{j}_1,$$

and we have the familiar result of Eq. (1).


Principle of Detailed Balance*


Martin J. Klein
Department of Physics, Case Institute of Technology, Cleveland, Ohio 


(Received August 30, 1954) 


This note answers a question raised by Lloyd and Pake as to whether or not the principle of detailed 
balance applies in nonequilibrium steady states. It is shown that detailed balance holds only at equilibrium, 
and that nonequilibrium steady states must be maintained by cyclic processes. 





The principle of detailed balance¹ requires that
transitions between any two states take place
with equal frequency in either direction at equilibrium.
This principle prevents the maintenance of equilibrium
by means of cyclic processes. In a recent paper Lloyd
and Pake² have raised the question as to whether or
not detailed balance applies to nonequilibrium steady
states. Their discussion suggested a negative answer,
and it is the purpose of this note to discuss the problem
further. Our conclusion is that detailed balance applies
only at equilibrium, and that cyclic processes are
essential for the maintenance of nonequilibrium steady
states.

We shall consider a very simple example, the one
treated by Lloyd and Pake, though the nature of the
argument is not limited to this case. We consider a
system with three energy states $\epsilon_i$ $(i = 1, 2, 3)$. Let $p_i$
denote the probability that the system (considered as
one member of an ensemble) is in state $i$. We have, of
course, the restriction:

$$\sum_i p_i = 1. \tag{1}$$


The time variation of the $p_i$ is determined by the
equations

$$dp_i/dt = {\sum_j}'\,(a_{ji}p_j - a_{ij}p_i), \tag{2}$$

where the prime indicates that the term $j = i$ is omitted
from the summation. The quantity $a_{ij}$ represents the
probability per unit time of a transition from state $i$
to state $j$. We assume that our system is maintained
at temperature $T$ by thermal contact with a heat bath
at this temperature. It follows³ that the $a_{ij}$ satisfy the
equations

$$a_{ij}\exp(-\epsilon_i/kT) = a_{ji}\exp(-\epsilon_j/kT). \tag{3}$$

In the absence of any external influence on the
system, there is only one steady state solution of Eq.
(2), the equilibrium state in which $p_i^0 = c\exp(-\epsilon_i/kT)$,
$c$ being a constant. Because of Eq. (3), it is evident that
the principle of detailed balance holds in equilibrium,
$a_{ij}p_i^0 = a_{ji}p_j^0$.

* This work was supported by a grant from the National Science
Foundation.

1 See Richard C. Tolman, The Principles of Statistical Mechanics
(Oxford University Press, Oxford, 1938), pp. 165, 521. See also
L. Onsager, Phys. Rev. 37, 405 (1931); 38, 2265 (1931). The
principle is thoroughly discussed in J. S. Thomsen, Phys. Rev.
91, 1263 (1953).

2 J. P. Lloyd and G. E. Pake, Phys. Rev. 94, 579 (1954), Secs.

3 This relationship is mentioned in reference 2. For a general
derivation, see M. J. Klein and P. H. E. Meijer, Phys. Rev. 96,
250 (1954), Appendix. An alternate derivation can be found in
R. T. Cox, Revs. Modern Phys. 22, 238 (1950), Sec. 2. (The
author would like to thank Dr. J. S. Thomsen for pointing out
this reference.)

We now subject the system to an external influence
which causes transitions between states 1 and 2, so
that Eq. (2) becomes

$$\begin{aligned}
dp_1/dt &= {\sum_j}'\,(a_{j1}p_j - a_{1j}p_1) + b(p_2 - p_1),\\
dp_2/dt &= {\sum_j}'\,(a_{j2}p_j - a_{2j}p_2) + b(p_1 - p_2),\\
dp_3/dt &= {\sum_j}'\,(a_{j3}p_j - a_{3j}p_3).
\end{aligned} \tag{4}$$

Here $b$ is the induced transition probability per unit
time for transitions in either direction between states
1 and 2. (In the experimental situation treated by
Lloyd and Pake the system is a set of spins and the
transitions are induced by an oscillating magnetic
field of frequency $\nu = (\epsilon_2 - \epsilon_1)/h$.)

To determine the steady state it is necessary to solve
Eq. (4) with all $dp_i/dt$ equal to zero, using the constraint
expressed by Eq. (1). Our interest is not in the solutions
$p_i^{(s)}$, but rather in the rates of transition which exist in
the steady state. Carrying out the algebra we readily
obtain the following results⁴:

$$a_{31}p_3^{(s)} - a_{13}p_1^{(s)} = (a_{12} + b)p_1^{(s)} - (a_{21} + b)p_2^{(s)} = a_{23}p_2^{(s)} - a_{32}p_3^{(s)} = b\,a_{13}a_{32}[\exp\{(\epsilon_2 - \epsilon_1)/kT\} - 1]. \tag{5}$$

It is clear from Eq. (5) that detailed balance does not
apply to the steady state considered here. This state is
in fact maintained by a cyclic process which we may
indicate schematically as 3→1→2→3, where the arrows
indicate a net "flow." (We assume here that $\epsilon_2 > \epsilon_1$;
otherwise all arrows are reversed.) As might be expected,
the net rate of induced transitions between states 1 and
2 corresponds to an absorption of energy by the system
from the external source. We have

$$b(p_1^{(s)} - p_2^{(s)}) = b\,a_{13}a_{32}[\exp\{(\epsilon_2 - \epsilon_1)/kT\} - 1] + b\,a_{21}(a_{31} + a_{32})[1 - \exp\{-(\epsilon_2 - \epsilon_1)/kT\}], \tag{6}$$

where the second term is equal to $a_{21}p_2^{(s)} - a_{12}p_1^{(s)}$,
the net rate of "natural" transitions in the reverse
direction.
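
The content of Eqs. (5) and (6) can be made concrete by solving the steady state of Eq. (4) numerically. The Python sketch below is illustrative and not part of the original note. The rate matrix is a hypothetical one, constructed as $a_{ij} = s_{ij}\exp[-(\epsilon_j - \epsilon_i)/2kT]$ with a symmetric prefactor $s_{ij}$ so that Eq. (3) holds. States 1, 2, 3 are stored at indices 0, 1, 2, and the energies, prefactors, and value of $b$ are arbitrary sample values.

```python
import numpy as np


def boltzmann_rates(eps, kT, s):
    """Rates a[i, j] for i -> j that satisfy Eq. (3), built from a symmetric prefactor s."""
    eps = np.asarray(eps, dtype=float)
    a = np.asarray(s, dtype=float) * np.exp(-(eps[None, :] - eps[:, None]) / (2.0 * kT))
    np.fill_diagonal(a, 0.0)
    return a


def steady_state(a, b):
    """Solve Eq. (4) with dp/dt = 0 and sum(p) = 1; b couples states 1 and 2."""
    w = a.copy()
    w[0, 1] += b
    w[1, 0] += b
    m = w.T - np.diag(w.sum(axis=1))   # dp/dt = m @ p
    m[-1, :] = 1.0                     # replace one equation by the normalization, Eq. (1)
    rhs = np.zeros(len(a))
    rhs[-1] = 1.0
    return np.linalg.solve(m, rhs)


def net_flows(a, b, p):
    """Net rates of transition 1 -> 2, 2 -> 3, and 3 -> 1, the three members of Eq. (5)."""
    f12 = (a[0, 1] + b) * p[0] - (a[1, 0] + b) * p[1]
    f23 = a[1, 2] * p[1] - a[2, 1] * p[2]
    f31 = a[2, 0] * p[2] - a[0, 2] * p[0]
    return f12, f23, f31


if __name__ == "__main__":
    eps, kT = [0.0, 1.0, 2.5], 1.0
    s = [[0.0, 1.0, 2.0], [1.0, 0.0, 0.5], [2.0, 0.5, 0.0]]
    a = boltzmann_rates(eps, kT, s)
    for b in (0.0, 0.7):
        p = steady_state(a, b)
        print("b =", b, " p =", p, " net flows =", net_flows(a, b, p))
```

With $b = 0$ all three net flows vanish and the populations are Boltzmann-distributed; with $b > 0$ the three flows are equal and nonzero, which is the cyclic process referred to in the text.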

We can understand the necessity for the cycle of
transitions 1→2→3→1 which maintains the steady
state by studying the rate of entropy production in
this state. It is clear that entropy is produced since
the energy absorbed from the external source is converted
into thermal energy of the heat bath.

We can write, for the total rate of entropy production,⁵

$$\frac{dS}{dt} = \sum_i\Big[-k\,\frac{dp_i}{dt}\log p_i - \frac{\epsilon_i}{T}\,{\sum_j}'\,(a_{ji}p_j - a_{ij}p_i)\Big]. \tag{7}$$

The first term comes from the redistribution over the
states $i$ and vanishes in the steady state. The second
term arises from the energy delivered to the heat bath
at temperature $T$. Making use of Eq. (4) and of the
fact that all $dp_i/dt$ vanish in the steady state, we can
write the rate of entropy production in the form

$$dS/dt = b(p_1^{(s)} - p_2^{(s)})(\epsilon_2 - \epsilon_1)(1/T). \tag{8}$$

The first factor, $b(p_1^{(s)} - p_2^{(s)})(\epsilon_2 - \epsilon_1)$, is the rate at
which energy is absorbed from the external source,
and the factor $(1/T)$ arises from the transfer of this
energy to the heat bath at temperature $T$.

4 A constant factor equal to $\Delta^{-1}$, where $\Delta$ is the determinant of
coefficients in Eq. (4), has been omitted from the right-hand sides
of Eqs. (5), (6).

5 The expression for the rate of entropy production can be
derived by the method used in M. J. Klein and P. H. E. Meijer,
Phys. Rev. 96, 250 (1954), Sec. III. It can be proved that the
steady state solution of Eq. (4) is the state of minimum entropy
production as in the above reference. This point will be discussed
further in a paper now in preparation.

Now the rate of entropy production can also be
expressed in terms of the energy delivered to the heat
bath per unit time as

$$dS/dt = (\epsilon_2 - \epsilon_1)(1/T)\big[(a_{21}p_2^{(s)} - a_{12}p_1^{(s)}) + (a_{31}p_3^{(s)} - a_{13}p_1^{(s)})\big] + (\epsilon_3 - \epsilon_2)(1/T)\big[(a_{32}p_3^{(s)} - a_{23}p_2^{(s)}) + (a_{31}p_3^{(s)} - a_{13}p_1^{(s)})\big], \tag{9}$$

where we have written $(\epsilon_3 - \epsilon_1) = (\epsilon_3 - \epsilon_2) + (\epsilon_2 - \epsilon_1)$. It
is seen from the foregoing analysis, particularly Eqs.
(5) and (6), that these two expressions, Eqs. (8) and
(9), agree precisely because of the failure of the principle
of detailed balance and because of the existence of a
cycle of transitions.

In conclusion, we may state that the argument 
carried out above for a special example should apply 
quite generally: the principle of detailed balance 
cannot hold in nonequilibrium steady states. 
Ultrasonic Propagation in Magnetically Cooled Helium*†


C. E. CHASE‡ AND MELVIN A. HERLIN 
Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 


(Received December 14, 1954) 


Measurements have been made of the velocity and attenuation of ordinary sound in liquid helium at a 
frequency of 12.1 Mc/sec, over the temperature range from 1°K down to 0.1°K. The velocity is essentially 
independent of temperature and has the value 240±5 m/sec. The attenuation passes through two closely 
spaced maxima near 0.9°K and then falls smoothly to zero as the absolute zero is approached. These results 
agree qualitatively with the theoretical predictions of Khalatnikov. 


I. INTRODUCTION 


THE present paper describes a series of experiments 
on the propagation of ordinary (“first”) sound 
in liquid helium below 1°K. Interest in these measure- 
ments was first stimulated by the experiments per- 
formed by one of the authors¹,² in the range of tem- 
peratures obtainable by pumping on the helium bath 
and by the theoretical calculations of Khalatnikov.³⁻⁵ 


*This work was supported in part by the Signal Corps; the 
Office of Scientific Research, Air Research and Development 
Command; and the Office of Naval Research. 

† Preliminary results have been reported in a recent Letter by 
the authors [Phys. Rev. 95, 565 (1954)]. 

‡ Now at Lincoln Laboratory, Massachusetts Institute of 
Technology. 

1 K. R. Atkins and C. E. Chase, Proc. Phys. Soc. (London) A64, 
826 (1951). 

2 C. E. Chase, Proc. Roy. Soc. (London) A220, 116 (1953). 

3 I. M. Khalatnikov, J. Exptl. Theoret. Phys. (U.S.S.R.) 20, 
243 (1950). 

4 I. M. Khalatnikov, J. Exptl. Theoret. Phys. (U.S.S.R.) 23, 8 
(1952). 

5 I. M. Khalatnikov, J. Exptl. Theoret. Phys. (U.S.S.R.) 23, 
21 (1952). 


These measurements showed that the velocity of 
propagation $u_1$ levels off to an almost constant value at 
temperatures somewhat above 1°K, but that the attenu- 
ation passes through a maximum just below 1°K and 
exhibits the frequency dependence, over the range from 
2 Mc/sec to 12 Mc/sec, characteristic of a relaxation 
process. This behavior was predicted by Khalatnikov. 
It was therefore decided to extend the measurements 
into the temperature range obtainable by the adiabatic 
demagnetization of a paramagnetic salt, focusing atten- 
tion chiefly upon the attenuation. These experiments 
were performed under the saturated vapor pressure at 
a frequency of 12.1 Mc/sec, and extend from about 1°K 
to 0.1°K. 
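
The phrase "characteristic of a relaxation process" can be made concrete with a short Python sketch. It uses the generic single-relaxation form, in which the absorption is proportional to ω²τ/(1+ω²τ²); this form is an assumption made here for illustration and is not Khalatnikov's expression. The function name and the grid of relaxation times are hypothetical.

```python
import math

def relaxation_absorption(omega, tau):
    """Shape of a single-relaxation absorption coefficient, up to a constant factor:
    omega^2 tau / (1 + omega^2 tau^2). Generic form, assumed for illustration."""
    return omega ** 2 * tau / (1.0 + (omega * tau) ** 2)

frequency_hz = 12.1e6                      # operating frequency quoted in the text
omega = 2.0 * math.pi * frequency_hz

# Scan tau over a logarithmic grid; in the liquid, tau would vary with temperature.
taus = [10.0 ** (-10.0 + 0.01 * k) for k in range(601)]
tau_peak = max(taus, key=lambda tau: relaxation_absorption(omega, tau))

# At omega * tau = 1 the shape function equals omega / 2.
peak_height = relaxation_absorption(omega, 1.0 / omega)
```

On this model the absorption at fixed frequency is greatest where ωτ = 1, and the height of the maximum is proportional to the frequency, so that measurements at several frequencies distinguish a relaxation process from an absorption proportional to ω².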


II. EXPERIMENTAL METHOD 
1. Demagnetization Cryostat 


The cryostat in which the experiments were per- 
formed is shown schematically in Fig. 1. This apparatus 
is essentially the same as one originally designed by Dr. 
FIG. 1. The demagnetization cryostat: A, B, concentric cans; 
C, paramagnetic salt; D, etalon; E, F, thin-walled cones; G, 
Vaseline-filled tube for leads; H, pumping tube; I, radiation trap; 
J, holes to admit helium; K, coaxial cable. Inset: L, quartz 
crystal; M, electrode; N, Bakelite insulator; O, beryllium-copper 
spiral spring; P, brass case; Q, brass tube; R, reflector; S, hole to 
admit helium. 


J. Ashmead, at present in use in the Royal Society 
Mond Laboratory, Cambridge. It differs from conven- 
tional apparatus for the magnetic cooling of liquid 
helium chiefly in the method of introducing the liquid 
from the bath and subsequent isolation during demag- 
netization. The two thin-walled stainless-steel cones, 
E and F, are machined and lapped to fit together as 
closely as possible. The inner cone is sealed off full of air, 
which freezes at helium temperatures to produce a 
thermal vacuum inside the cone. This cone may be 
raised and lowered by means of a thin-walled stainless- 
steel tube extending to the cryostat cap. When the cone 
is raised, helium from the surrounding bath flows in to 
fill the experimental chamber, and the liquid filling the 
space between the cones provides almost perfect thermal 
contact for the rapid removal of the heat of magnetiza- 
tion. When the cone is seated, the heat flow is reduced 
to a very low value, and the thermal isolation is suf- 
ficient to keep the experimental chamber colder than 
the bath for more than an hour. 


Since it is not necessary to use exchange gas with 
this apparatus, tedious pumping out of the vacuum 
space is eliminated and magnetizing times can be very 
short (of the order of two or three minutes). It is there- 
fore possible to obtain many more demagnetizations 
during a run. In actual practice, precooling was facili- 
tated by introducing approximately 1 cm of air into 
the vacuum space; this freezes out, however, and 
presents no problem. Occasionally a few bursts of helium 
gas were introduced during the transfer of liquid helium 
to assist in the further cooling of the large amount of 
paramagnetic salt present. This procedure was probably 
not strictly necessary, but it somewhat increased the 
efficiency of transfer. This exchange gas was readily 
pumped out before the bath temperature reached 1°K, 
and pressures better than 10-° mm Hg were then 
maintained for the duration of the run. 

The inner can B was filled with approximately 200 
grams of ferric ammonium alum, and the ultrasonic 
etalon D was buried in the middle of this salt to make 
certain that the two were in thermal equilibrium. Most 
of the salt was in the form of large crystals, with the 
interstices loosely filled with fine powder. This arrange- 
ment allows the helium to penetrate freely through the 
salt and makes possible the rapid establishment of 
thermal equilibrium. The leads for the ultrasonic ap- 
paratus and for a resistance thermometer (not shown 
in the figure) were brought out through three stainless- 
steel tubes G, one-eighth inch in diameter, which were 
then filled with Vaseline. The Vaseline freezes and 
prevents the flow of helium, and has a low thermal 
conductivity in the solid state. The inner can was held 
in place by means of a gold O-ring seal. 


2. The Etalon 


The ultrasonic etalon is shown in the inset in Fig. 1. 
An X-cut quartz crystal L, 1.0 cm in diameter, was 
held against the face of the outer case P by the spring- 
loaded electrode M and insulator N. The curious shape 
of the outer case was designed to minimize direct 
propagation of the ultrasonic pulses through the sup- 
porting tube Q, which would give rise to severe clutter. 
This tube serves to align the crystal with the reflector R, 
and its faces are machined parallel to within 0.0002 inch 
at room temperature. Subsequent misalignment upon 
cooling was found to be negligible. By using different 
supporting tubes, it was easy to change the path length 
between runs. Each path was measured at room tem- 
perature with micrometers and a micrometer depth 
gauge, and a correction of 0.5 percent was made to 
allow for contraction upon cooling. This was only an 
approximate estimate, but introduced errors of neg- 
ligible importance compared with the other sources of 
experimental error. 


3. Electronics 


The electronic apparatus is essentially the same as 
that described earlier.¹,² A pulsed oscillator was used 
to drive the quartz crystal at its resonant frequency. 
The resulting sound pulses, after transmission through 
the helium and reflection at the reflector, were detected 
by the same crystal, amplified, demodulated, and dis- 
played on an oscilloscope. A DuMont type 256D A/R 
oscilloscope was used for this purpose, and provided the 
necessary synchronizing pulses as well as a calibrated 
time delay, which was used in the measurement of the 
velocity. 

For measuring the attenuation, a comparison pulse 
from a pulsed signal generator at the operating fre- 
quency was fed through a calibrated attenuator and 
into the input of the amplifier along with the received 
signal. By setting the attenuator so that both pulses 
appeared the same height on the oscilloscope, it was 
possible to determine the signal size directly, and all 
nonlinearities in the amplifiers were cancelled out. 
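
The cancellation of amplifier nonlinearities by the comparison method may be illustrated as follows. The Python sketch below assumes only that the amplifier response is monotonic; the saturating response, the amplitudes, and the function names are hypothetical and are not taken from the apparatus described.

```python
import math

def amplifier(v):
    """Hypothetical saturating (nonlinear but monotonic) amplifier response."""
    return math.tanh(3.0 * v)

def match_attenuator(signal_height, generator_amplitude, tol=1e-12):
    """Bisect on the attenuator's amplitude ratio until the comparison pulse
    shows the same displayed height as the received signal."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if amplifier(mid * generator_amplitude) < signal_height:
            lo = mid      # comparison pulse too small: reduce the attenuation
        else:
            hi = mid      # comparison pulse too large: increase the attenuation
    return 0.5 * (lo + hi)

true_signal = 0.137            # arbitrary received amplitude (hypothetical units)
generator_amplitude = 1.0      # unattenuated comparison pulse (hypothetical units)

displayed = amplifier(true_signal)                    # what the oscilloscope shows
ratio = match_attenuator(displayed, generator_amplitude)
recovered = ratio * generator_amplitude               # signal size read from the attenuator
attenuator_db = -20.0 * math.log10(ratio)             # the same setting expressed in decibels
```

Because both pulses pass through the same amplifier, equal displayed heights imply equal input amplitudes whatever the form of the response, provided it is monotonic.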


4. Demagnetization Technique 


Demagnetizations were performed from a starting 
temperature of about 0.9°K, obtained by means of a 
large diffusion pump, and with an initial field of 6200 
gauss. Under these conditions the lowest temperature 
reached was 0.10°K. During demagnetization the fol- 
lowing procedure was adopted. With the bath at 0.9°K 
and the inner cone raised, the magnet was turned on. 
The heat of magnetization was rapidly conducted away 
to the bath; within two or three minutes, the tem- 
perature of the inner chamber had fallen once more to 
its original value. The progress of this cooling was fol- 
lowed by means of the resistance thermometer. After 
equilibrium was reached, the cone was firmly seated, 
the magnet switched off and moved away, and the 
experimental observations were begun. 

During the initial runs with this apparatus, the 
warming rate was very high, and the cryostat warmed 
up from the lowest temperature to bath temperature in 
about ten minutes. It was later found that a large part 
of this heat leak was caused by eddy currents induced 
in the brass can by the field of the measuring coil which 
was used to determine the temperature of the salt. When 
this field was reduced, the warming time was more than 
one hour. 


5. Temperature Measurement 


The temperature of the helium bath was determined 
from the vapor pressure of the liquid by using the vapor 
pressure tables of van Dijk and Shoenberg.⁶ A mercury 
manometer read with a cathetometer was used for 
pressures above 4 cm Hg, a McLeod gauge for lower 
pressures. A correction for the hydrostatic pressure 
head was applied at all temperatures above the λ point. 

Temperatures below 1°K were measured by means of 
an ac mutual inductance bridge operated from a stable 
33⅓ cps source. The output of the bridge was amplified 


6 H. van Dijk and D. Shoenberg, Nature (London) 164, 151 (1949). 

by a three-stage “twin-T” amplifier with a band width 
of about one cycle, and displayed on an oscilloscope. By 
synchronizing the oscilloscope to the bridge input, it 
was possible to tell whether the unbalance of the bridge 
was resistive or reactive and in which direction the re- 
sistance or mutual inductance must be varied, thus 
making it possible to balance the bridge very rapidly. 
Measurements could be made to within about 2 μh, 
corresponding to a temperature error of 0.002°K at 
1°K. The errors in the absolute temperature are prob- 
ably greater than this, however, because of errors in the 
calibration region above 1°K where the sensitivity was 
much smaller and because of a tendency for the cali- 
bration to drift slowly during the course of a run. During 
one run this drift amounted to 0.05°K at 1°K, over a 
period of several hours. 

The mutual inductance readings in the region above 
1°K were fitted to a Curie law by the method of least 
squares. At the lowest temperatures the departures from 
Curie’s law become significant; these were determined 
from the measurements of Kurti, Laine, and Simon⁷ 
on the susceptibility of iron ammonium alum. The cor- 
rection amounted to 0.015°K at 0.10°K. In order to 
determine this correction, the demagnetizing factor of 
the sample had to be calculated. For this purpose, the 
salt was approximated by an ellipsoid of major and 
minor axes 6.5 cm and 2.5 cm, respectively, and the 
demagnetizing factor calculated from formulas given 
by Garrett.⁸ It turned out, however, that both the 
present sample and that used by Kurti, et al., were suf- 
ficiently close to a long cylinder so that this was of 
small importance at temperatures down to 0.1°K. 
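
The Curie-law calibration described above may be sketched in Python as follows. The sketch assumes that the bridge reading is of the form M = M∞ + C/T in the calibration region; the calibration temperatures, the constants, and the function names are hypothetical. The value C = 1000 μh·°K is chosen only because it reproduces the quoted sensitivity of 2 μh for 0.002°K at 1°K. The departures from Curie's law at the lowest temperatures and the demagnetizing correction are not included.

```python
def fit_curie(temps_k, readings):
    """Least-squares fit of readings = m_inf + c / T; returns (m_inf, c)."""
    xs = [1.0 / t for t in temps_k]
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(readings) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, readings))
    c = sxy / sxx
    return ybar - c * xbar, c

def magnetic_temperature(reading, m_inf, c):
    """Invert the fitted Curie law to obtain the magnetic temperature T*."""
    return c / (reading - m_inf)

# Hypothetical calibration points above 1 deg K (temperatures from vapor pressure).
calib_temps = [1.0, 1.2, 1.5, 2.0, 3.0, 4.2]
true_m_inf, true_c = 250.0, 1000.0            # microhenries, microhenry-degrees (assumed)
calib_readings = [true_m_inf + true_c / t for t in calib_temps]

m_inf, c = fit_curie(calib_temps, calib_readings)

# A reading taken after demagnetization, converted to magnetic temperature.
t_star = magnetic_temperature(true_m_inf + true_c / 0.25, m_inf, c)

# Temperature change equivalent to 2 microhenries at 1 deg K: dT = T^2 dM / C.
sensitivity_at_1k = 1.0 ** 2 * 2.0 / c
```

Since dT = T² dM/C, a given uncertainty in the bridge reading corresponds to a much smaller temperature uncertainty at 0.1°K than at 1°K, which is why the errors in the calibration region dominate.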


III. SOURCES OF ERROR 
1. Temperature Errors 


The chief sources of temperature errors were men- 
tioned in the preceding section: errors in the mutual 
inductance calibration and drift of this calibration, pre- 
sumably the result of the effect of drift of room tem- 
perature on the balancing coils. The first of these errors 
would be most important at low temperatures where the 
extrapolation of the calibration curve is greatest; how- 
ever, from the good agreement between points obtained 
in separate runs with independent calibrations it seems 
unlikely that this error is greater than perhaps 0.01°K. 
The second error appeared to be a constant rate of drift 
of a few microhenries per hour, and changed the 
readings near 1°K by a few hundredths of a degree, 
while affecting the lowest temperatures very little. It 
was found that if the first and last of a series of demag- 
netizations were adjusted so that the attenuation 
agreed with its known value above 1°K and a linear 
rate of drift was assumed, the points for intermediate 
demagnetizations agreed well with the rest. This pro- 


7 Kurti, Laine, and Simon, Compt. rend. 204, 675 (1937). 
8 C. G. B. Garrett, Magnetic Cooling (Harvard University 
Press, Cambridge, 1954). 

FIG. 2. Effect of warming rate on thermal equilibrium between salt 
and helium. The numbers are serial numbers of demagnetizations. 


cedure was therefore adopted for determining the tem- 
peratures in those runs shown as black dots in Fig. 3. 
In the earlier runs (shown as open circles) this effect 
was unimportant, since the runs were of shorter dura- 
tion and the measurements did not extend to sufficiently 
high temperatures for the error to be appreciable. 


2. Thermal Equilibrium Between Salt and Helium 


It has often been observed that in demagnetization 
experiments where the warmup time is very short, 
thermal equilibrium between the salt and helium is not 
attained. This question must therefore be carefully 
considered if any weight is to be attached to measure- 
ments made within the first few minutes after demag- 
netization. In the present work a number of the meas- 
urements were made with total warmup times of the 
order of ten minutes, and it was observed that these 
results were completely self-consistent and agreed well 
with later measurements made with warmup times of 
about an hour as long as the warming rate was less 
than a certain critical amount. For faster warming 
rates, however, a considerable shift in the attenuation 
curves in the direction of lower magnetic temperature 
was observed. This can be explained by the assumption 
that the helium is not actually as cold as the salt, which 
is serving both as coolant and thermometer. This effect 
is illustrated in Fig. 2. In this diagram the magnetic 
temperature corresponding to an attenuation of 0.10 
cm⁻¹ is plotted against the time after demagnetization 
at which this value was reached. The attenuation of 
sound thus serves as a thermometer to determine the 
temperature of the helium, while the magnetic measure- 
ments are used to determine the temperature of the 
salt. It will be seen that for times less than about 70 sec 
after demagnetization the helium is definitely warmer 
than the salt, but that equilibrium is rapidly reached 
thereafter. All those points obtained with insufficiently 
long warmup times have been omitted from the curves 
in the following section. 


3. Velocity Errors 


Velocity measurements made with a fixed path length 
are necessarily subject to certain errors because of the 
necessity of measuring the absolute magnitude of the 
path length and estimating the amount of its contrac- 
tion upon cooling. An even larger error may result from 
the assumption that this figure corresponds to the 
acoustical path length, for it has been found that the 
processes of reflection and detection by quartz crystals 
lead to spurious extra delays which may be of the order 
of a few microseconds. This effect is not understood at 
present but presumably results from the relatively high 
Q of the crystals, which require several cycles to start 
oscillating. A third source of error lies in the difficulty 
of estimating the positions of the feet of the input pulse 
and received signal as the attenuation varies. This may 
have introduced errors of 2 μsec or 3 μsec in the present 
measurements. As a consequence of all these sources of 
error, the velocity measurements are considered to be 
accurate only to within ±2 percent. 
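
The order of magnitude of this estimate can be checked with a short Python sketch. It assumes that the quoted path lengths (3.94 cm and 1.96 cm, given in Sec. IV) are the crystal-to-reflector spacings, so that the pulse travels twice the path, and it adds the timing and path-length contributions linearly as a worst case. The function names are hypothetical, and the spurious delays in the crystals are not included.

```python
def round_trip_time(path_cm, velocity_m_per_s):
    """Transit time in seconds for crystal -> reflector -> crystal (assumed geometry)."""
    return 2.0 * path_cm * 1.0e-2 / velocity_m_per_s

def fractional_velocity_error(path_cm, velocity_m_per_s, timing_error_s, path_error_fraction):
    """Worst-case (linear) sum of the timing and path-length contributions to dv/v."""
    return timing_error_s / round_trip_time(path_cm, velocity_m_per_s) + path_error_fraction

velocity = 240.0          # m/sec, the value quoted in the text
timing_error = 3.0e-6     # sec, the larger figure quoted for locating the pulse feet
path_error = 0.005        # assumption: the whole 0.5 percent contraction correction is uncertain

t_long = round_trip_time(3.94, velocity)
err_long = fractional_velocity_error(3.94, velocity, timing_error, path_error)
err_short = fractional_velocity_error(1.96, velocity, timing_error, path_error)
```

Under these assumptions the round-trip time for the longer path is about 330 μsec, so that a timing error of 3 μsec alone contributes nearly 1 percent, and the shorter path is proportionately more sensitive.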


4. Attenuation Errors 


The largest single source of error in the attenuation 
may come from the fact that the absolute value cannot 
be found from measurements with a single path length, 
and the results had therefore to be fitted to the known 
attenuation above 1°K. However, the accuracy of this 
fitting procedure is confirmed by the fact that results 
obtained with two different path lengths agree well with 
one another, and indirectly by the fact that the resulting 
attenuation curve falls smoothly to zero at the absolute 
zero. A further source of error arose from possible drift 
of the size of the input pulse during a run; this was 
eliminated as far as possible by checking the signal size 
against the known attenuation at bath temperature at 
the end of every demagnetization. With one exception 
this drift was negligible; in that instance it amounted 
to 0.04 cm⁻¹, and the results were consistent with the 
other runs when a correction of this amount was applied. 
The largest remaining source of error was the calibration 
of the attenuator on the signal generator, which was 
accurate only to a few percent. As a result of all these 
sources of error, it is estimated that the attenuation 
measurements are accurate to within ±0.05 cm⁻¹. 


IV. RESULTS 
1. The Velocity 


Because of the inherent difficulties in velocity meas- 
urements discussed in the preceding section and the 
fact that no appreciable changes occurred below 1°K, 
only a few observations were made. From the results of 
five different demagnetizations, with path lengths of 
3.94 cm and 1.96 cm, it was concluded that the velocity 
at 0.1°K is equal to 240±5 m/sec. This is in agreement 
with the earlier estimate² of 239±2 m/sec obtained 
from extrapolation of the results at higher tempera- 
tures. Although the measured value is less accurate than 
the extrapolated one, it excludes the possibility of any 
gross changes in the velocity in this range and confirms 
the relationship between the velocities of first and second 
sound given by Landau⁹ and discussed earlier by one of 
the authors.¹,² Measurements in the region between 
0.7°K and 1.1°K were even more difficult because of the 
high attenuation, and were only possible with the 
shorter of the path lengths given above. It was found, 
however, that no appreciable variations occur in the 
neighborhood of the maximum attenuation. 


2. The Attenuation 


The results of the attenuation measurements are 
shown in Fig. 3. The open circles represent those points 
taken with a path length of 3.94 cm. With this path 
length the signal became comparable with noise near 
0.7°K, and measurements in the neighborhood of the 
maximum attenuation were impossible. A shorter path 
length of 1.96 cm was accordingly used to investigate 
this region, and the results are shown as solid circles. 
The full curve reproduces the earlier results of one of 
the authors.² The points shown are only representative 
of all those taken in a total of 35 demagnetizations with 
the longer path and 10 with the shorter. Since only the 
relative attenuation was determined, the results have 
been adjusted to agree with the earlier results above 
1.2°K. In all of those runs with the longer path length, 
the warmup time was only about ten minutes to bath 
temperature, and observations were possible for about 
four minutes before the attenuation became unmeas- 
urably high. In the later runs the warmup time was 
over an hour during three demagnetizations from a 
FIG. 3. Attenuation of sound in liquid helium at 12.1 Mc/sec. 

9 L. Landau, J. Phys. (U.S.S.R.) 5, 71 (1941). 

FIG. 4. Attenuation of sound in liquid helium at 12.1 Mc/sec. 
The slope of the dotted line is 2.8. 


bath temperature of 0.9°K, and only about 10 minutes 
during several other runs starting from 1.2°K. 

The most significant feature of Fig. 3 is the presence 
of a double maximum in the attenuation around 0.9°K. 
This double maximum was not observed during the 
earlier measurements, apparently because the relatively 
few points taken at these temperatures happened to 
fall on the smooth curve shown in the diagram and the 
minimum was completely missed. The earlier results at 
2.0 Mc/sec and 6.0 Mc/sec are presumably similarly 
in error. The existence of these twin maxima provides 
direct confirmation for the prediction of Khalatnikov³⁻⁵ 
that the attenuation of sound in helium II is determined 
by two relaxation times. The relation of these measure- 
ments to the theory will be discussed in more detail in 
the following section. 

Another important fact is that the attenuation falls 
smoothly to zero as the absolute zero is approached. 
Such behavior is to be expected on the basis of any 
theory that explains the attenuation of sound in terms 
of the elementary excitations of the liquid, since the 
density of these excitations falls to zero as the absolute 
zero is approached. 

In Fig. 4 the results below 0.8°K are reproduced on a 
logarithmic plot. From this diagram another effect may 
be seen: above 0.3°K the attenuation is quite accurately 
proportional to $T^{2.8}$; at lower temperatures the variation 
is more rapid. It does not appear to be possible to fit 
the entire curve to any sort of exponential or other 
simple relation, and therefore it seems probable that 
this abrupt change near 0.3°K represents some sort of 
change in the fundamental processes giving rise to 
attenuation of sound. It is perhaps significant that this 
occurs just in the region where the phonon mean free 
path is becoming comparable with the dimensions of 
the apparatus, and where the velocity of second sound 
starts to rise from approximately $u_1/\sqrt{3}$ to $u_1$.¹⁰,¹¹ 
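
The exponent read from a logarithmic plot such as Fig. 4 is the slope of a straight line through the points (log T, log α). A minimal Python sketch of such a determination is given below; the data are synthetic, generated from an assumed power law with a made-up prefactor, and are not the measured points. The function name is hypothetical.

```python
import math

def fit_power_law(temps, values):
    """Least-squares straight line through (log T, log value);
    returns (exponent, prefactor) of value = prefactor * T**exponent."""
    xs = [math.log(t) for t in temps]
    ys = [math.log(v) for v in values]
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
             / sum((x - xbar) ** 2 for x in xs))
    return slope, math.exp(ybar - slope * xbar)

temps = [0.3, 0.4, 0.5, 0.6, 0.7, 0.8]            # deg K, the range of the straight portion
synthetic = [0.9 * t ** 2.8 for t in temps]       # made-up prefactor, not measured data
exponent, prefactor = fit_power_law(temps, synthetic)
```

Applied to real points, a systematic departure of the residuals below 0.3°K would show the more rapid variation noted above.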


V. KHALATNIKOV THEORY 


In his derivation of the temperature variation of the 
attenuation of first sound in liquid helium,³⁻⁵ Khalat- 
nikov assumes that the mechanism responsible for the 
attenuation is inelastic collisions of the elementary 
excitations, that is, processes in which the numbers of 
phonons and rotons are changed. From his calculations 
it turns out that there are two relaxation times govern- 
ing the attenuation, corresponding to the two dominant 
processes: phonon-phonon and phonon-roton interac- 
tions. The presence of two maxima in the attenuation 
vs temperature curve suggests that this is indeed the 
case. However, an exact comparison of theory with 
experiment is complicated by the fact that elastic col- 
lision processes become important at temperatures of 
the order of 0.9°K. These processes are the ones which, 
according to Landau and Khalatnikov,¹²,¹³ are respon- 
sible for the viscosity of the liquid. At temperatures 
above 1°K the viscous relaxation times are so short that 
the viscosity at 12 Mc/sec is the same as the viscosity 
at zero frequency, and the viscous attenuation can be 
found from the usual Stokes formula. This part of the 
attenuation is independent of that calculated by the 
Khalatnikov formulas, and the two contributions must 
be added to obtain the experimentally observed value. 
Below 1°K, however, the viscous relaxation times 
become comparable with the time of one cycle of the 
sound wave, and the Stokes formula may no longer be 
used to calculate the viscosity attenuation. In this 
region the concepts of first and second viscosity no 
longer apply, and all the relaxation times must be con- 
sidered explicitly as contributing to the absorption. 
Since the maximum viscous attenuation is believed to 
occur just in the region where the attenuation from 
inelastic scattering processes is greatest, the situation is 

10 de Klerk, Hudson, and Pellam, Phys. Rev. 89, 326 (1953). 

11 V. Mayper and M. A. Herlin, Phys. Rev. 89, 523 (1953). 

12 L. Landau and I. M. Khalatnikov, J. Exptl. Theoret. Phys. 
(U.S.S.R.) 19, 637 (1949). 

13 L. Landau and I. M. Khalatnikov, J. Exptl. Theoret. Phys. 
(U.S.S.R.) 19, 709 (1949). 

extremely complicated, and a quantitative calculation 
does not appear to be feasible. 

Fortunately, it is possible to reach certain qualitative 
conclusions without the aid of a complete theory. At 
any temperature, the viscous attenuation will certainly 
be no greater than that calculated from the low- 
frequency viscosity by the Stokes formula, since the 
effect of the finite viscous relaxation time is to make the 
viscous attenuation pass through a maximum and 
eventually fall to zero. At 0.9°K and 12.1 Mc/sec this 
is less than 20 percent of the total attenuation, so that 
the errors involved in neglecting it altogether are not 
large. The constants of the Khalatnikov theory may 
thus be determined approximately. If the two maxima 
in the attenuation are associated with the phonon- 
phonon and phonon-roton relaxation times, the con- 
stants a and b occurring in Eqs. (2) and (3) of reference 
6 may be evaluated from the positions of the maxima 
independently of the behavior above 1°K. An estimate 
of their values gives 1/a~2X10-“, 1/b~0.8X10-". 
This value of 1/a is approximately twice as large as 
the value found earlier² from the data at higher tem- 
peratures, and would make the agreement in that range 
considerably worse. The exact shape of the curve is, 
however, extremely sensitive to the value of Δ, one of 
the parameters in Landau’s energy-momentum relation 
for rotons,⁹ and this parameter is not known with 
any degree of accuracy. In fact, it now seems likely that 
sound attenuation measurements provide the most 
accurate method of determining its value, if the effects 
of relaxation in the viscous processes can be calculated. 
An approximate estimate suggests that a value of 
Δ=9.2°K would provide good agreement between 
theory and experiment both in the relaxation regions 
and at higher temperatures. 
Adsorption of Mixtures of He³ and He⁴†

The distribution of He³ between the vapor phase and dilute mixtures of He³ in He⁴ adsorbed on jewellers’ 
rouge (Fe₂O₃) has been measured between 1.6°K and 2.3°K, for saturations in the range 15 percent to 99 
percent, corresponding to film thicknesses of 1.3 to 34 layers. The ratio of the mole fraction of He³ in the 
vapor phase to that in the adsorbed phase is practically independent of the thickness of the adsorbed film, 
and essentially the same as for the bulk liquid case. The onset of superfluidity in the adsorbed films does not 
influence the He³ concentration in samples taken by simple desorption techniques. 





A NUMBER of theories on the formation of the 
He II films have discussed it as a consequence 
of the Bose-Einstein condensation.¹ The isotope He³ 
obeys of course Fermi-Dirac statistics. For the case of 
a mixture of He³ and He⁴, it has therefore been pre- 
dicted² that an unusually selective adsorption of He⁴ 
should occur if a Bose-Einstein condensation mechanism 
is involved in the film formation. 

However, it is known that many of the properties 
of liquid He⁴, of liquid He³, and of the He⁴ film are 
very little influenced by the statistics.³ The heat of 
vaporization is only slightly affected by the λ 
transition, and the change in slope of the vapor 
pressure curve in the region of the λ point is almost 
imperceptible, as was shown by Keesom.⁵ Similarly, 
the adsorption isotherms of He⁴ show little of the 
influence of the λ phenomenon.⁶ 

The excellent theoretical predictions of de Boer and 
Lunbeck⁷ for the vapor pressure of pure He³ were 
deduced without consideration of the special statistics. 

The transport properties of He⁴ are of course 
enormously changed in the transition; the theory of 
Feynman⁸ shows the relationship of the statistics and 
of the Bose-Einstein condensation to these properties 
in condensed He⁴. 

Considerable attention has been paid to the study of 
mixtures of He³ and He⁴, in particular the distribution 
of He³ between the liquid and the vapor phase in 
dilute solutions.⁹ In this work, extreme precautions 
had to be taken to ensure that film flow and consequent 


† Supported in part by grants from the National Science 
Foundation. 

1 Bijl, de Boer, and Michels, Physica 8, 655 (1941); O. Halpern, 
Phys. Rev. 87, 520 (1952) for earlier literature. 

2 W. Band, J. Chem. Phys. 19, 435-48 (1951). 

3 See the recent review by J. G. Daunt and R. S. Smith, Revs. 
Modern Phys. 26, 187 (1954) for a discussion of this point. 

4 See W. H. Keesom, Helium (Elsevier Publications, New York, 
1942), p. 232. 

5 Reference 4, p. 192. 

6 For a summary of the adsorption data, see E. Long and L. 
Meyer, Advances in Phys. 2, 1-11 (1953). 

7 J. de Boer and R. J. Lunbeck, Physica 15, 510 (1948). 

8 R. P. Feynman, Phys. Rev. 91, 1291, 1301 (1953); Phys. 
Rev. 94, 262 (1954). 

9 See the review by J. G. Daunt, Advances in Phys. 1, 209 
(1952). 


preferential evaporation of He⁴ did not falsify the 
samples taken from the system. It was found that the 
slightest disturbance of equilibrium during the sampling 
caused the superfluid component of the He⁴ to com- 
pensate the disturbance at a rate much faster than that 
of the He³, thus yielding samples highly enriched in 
He⁴.¹⁰ 

It seemed of interest to investigate the distribution 
of He³ between the vapor and adsorbed phases for the 
case of the unsaturated helium film, using very dilute 
mixtures of He³ in He⁴. In this case, the perturbing 
influence of film flow should be minimized, due to the 
thinness of the layers involved, and particularly to the 
very large ratio of evaporative surface to the mass of 
“liquid”, as compared to the system bulk liquid-vapor. 
Also, the variation of the distribution ratio with the 
thickness of the film should provide at least a crude 
test of the role of the statistics in the adsorption 
process. 

Accordingly, it was decided to measure the distri- 
bution of He³ between vapor and adsorbed phases as 
a function of temperature and film coverage, using as 
adsorbent jewellers’ rouge (Fe₂O₃), for which accurate 
adsorption data were available in this Laboratory. 

The method used is quite simple, more or less 
standard adsorption technique, with no precautions 
being taken to overcome a potentially perturbing 
influence of the film flow. The apparatus is shown 
schematically in Fig. 1. It consists of a copper adsorp- 
tion chamber of volume 2.63 cm³, containing 4.694 g 
of jewellers’ rouge, of total surface area 100±3 m², 
mounted in a helium cryostat, and connected to a glass 
capillary manifold by a stainless steel capillary 1 mm 
i.d. The manifold leads to a high-vacuum system, a 
calibrated Toepler pump, a manometer, a storage bulb, 
and break-seal-type sample tubes arranged so that 
each tube could be sealed off after filling for later 
analysis of its contents, using a mass spectrometer. 

The original material, containing ~4 percent He³ in 
He⁴, was obtained from the Stable Isotopes Division 
of the U. S. Atomic Energy Commission. It was first 


10 See reference 9 and Lane, Fairbank, Aldrich, and Nier, 
Phys. Rev. 75, 46 (1949). 

FIG. 1. Schematic diagram of adsorption apparatus. 

purified by passing it slowly through a glass wool-filled 
U-tube immersed in liquid helium, in order to remove 
impurities (especially tritium); it was then diluted to 
a mixture of concentration about 1 percent He’, and 
was transferred to the storage bulb. 

The experimental procedure was as follows: Known 
amounts of the mixture were admitted to the adsorption 
chamber and its connected capillary tubing by means 
of the calibrated Toepler pump, at each temperature 
of measurement. After adsorption equilibrium was 
established, and the adsorption pressure was read on 
the capillary manometer, using a Gaertner sliding 
microscope, measured amounts of gas were desorbed 
into the Toepler pump. Then desorption equilibrium 
was established, the pressure was read again, and the 
amount of gas collected was transferred into one of the 
sample tubes for later mass-spectrometric analysis. 
This process was repeated six to eight times. The 
cryostat was then allowed to warm to room tempera- 
ture, and the residual gas emerging from the adsorption 
bulb system was collected in the Toepler pump. The 
quantity of gas was measured, to obtain a material 
balance for the series of measurements, and the residual 
gas was then pumped into a sample tube for analysis. 

The temperature of the bath surrounding the 
adsorption bulb was determined by measuring the 
helium vapor pressure with manometers filled with 
Octoil S, using a Wild cathetometer. The temperature 
of the bath was kept to within a millidegree or less of a 
set value during the measurements. 

The isotopic composition of the samples was deter- 
mined with a 12-in. radius of curvature single-focusing 
mass spectrometer equipped with an electron multiplier. 
The resolution of the machine was one part in 2000, 
so that the He³ peak was completely resolved from the 
HD peak at the same mass number. Since the method 
of calculation used was such that the isotopic discrimi- 
nation of the mass spectrometer cancels out, an absolute 
calibration of the machine was unnecessary. 

The results of a typical run are shown in Table I. 
This run covered the region of high saturations at 
1.801°K; thus the film should show superfluid behavior.¹¹ 
Column I gives the history of the gas transferred 
to a given sample tube. The number n of adsorbed 
layers given in column II was calculated from the 
quantity adsorbed in the following way: From the 
total amount adsorbed (after deadspace corrections 
had been made) the amount $v_m$ adsorbed in the first 
layer [as derived from the BET (Brunauer, Emmett, 
and Teller) theory in the conventional manner] was 
deducted, another 3v,, was deducted for the second 
adsorbed layer, and the remainder was divided by 
0.28 cm³ per m² (see reference 6). Column III gives the 
total quantity of gas taken off in each desorption. In 
column IV is given the mole fraction of He³ in the 
vapor phase, expressed in percent of the sample taken. 
The amount of He³ taken out in successive desorptions, 
as shown in column V, is derived from the values given 
in III and IV. The amount of He³ in the dead-space is 
calculated from the known volume of the dead-space, 
the temperature and pressure, and from $c_v$, corrected 
from a plot of $c_v$ vs the amount desorbed, taking into 
account the fact that the value in column IV is the 
average $c_v$ during a single desorption, whereas the He³ 
concentration in the dead-space is only that value at 


TABLE I. Characteristic measurements of He³/He⁴ distribution 
between vapor and adsorbate. Run 6-10-53. T=1.801°K. 

[Table body not recoverable from the scan. Legible column headings: sample content; number of layers; amount desorbed (cm³); mole fraction. Row labels: original mixture; 1st through 7th desorption; residues at 300°K.] 

the end of this desorption; therefore the value is in 
rough approximation the average between $c_v$ for this 
particular desorption and that of the following one. 

The difference between (a) the total amount of He³ 
initially put into the system and (b) the amount 
removed by desorption plus that in the dead-space 
gives the quantity of He³ still adsorbed in the film. 
In a similar manner, the difference between (c) the 
total amount of gas supplied to the system, and (d) 
the amount removed by desorption plus that in the 
dead-space represents the quantity of helium still 
adsorbed. The quotient of these two values yields $c_a$, 
the mole fraction of He³ in the adsorbate, shown in 
column VII, in percent. Column VIII shows the ratio 
$c_v/c_a$, the distribution of He³ between vapor and 
condensed phase. 
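
The material balance described above can be sketched in a few lines. This is only an illustration: the function names and the numerical inputs are hypothetical and are not taken from Table I; all quantities are assumed to be expressed in the same gas-volume units.

```python
def adsorbate_fraction(he3_in, gas_in, he3_desorbed, gas_desorbed,
                       he3_dead, gas_dead):
    """Mole fraction of He3 in the adsorbate from a material balance.

    He3 still adsorbed = He3 supplied - (He3 desorbed + He3 in dead-space);
    the same balance on the total gas gives the total amount adsorbed.
    """
    he3_adsorbed = he3_in - (he3_desorbed + he3_dead)
    gas_adsorbed = gas_in - (gas_desorbed + gas_dead)
    return he3_adsorbed / gas_adsorbed


def distribution_ratio(c_v, c_a):
    """Ratio c_v/c_a of the He3 mole fractions in vapor and adsorbate."""
    return c_v / c_a


# Hypothetical inputs (cm^3 of gas at standard conditions), for illustration.
c_a = adsorbate_fraction(he3_in=1.0, gas_in=100.0,
                         he3_desorbed=0.5, gas_desorbed=10.0,
                         he3_dead=0.05, gas_dead=1.0)
ratio = distribution_ratio(c_v=0.05, c_a=c_a)
```

Because both balances are cumulative, an error in any one desorption propagates into every later value of $c_a$, which is presumably why the final material balance at room temperature serves as a check.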

It is seen from Table I that the distribution ratio 
$c_v/c_a$ is constant between coverages of nine to seven 


11 See E. Long and L. Meyer, reference 6, for data and discussion 
on the onset of superfluidity in adsorbed films of He⁴. 
adsorbed layers; in addition, analysis of the residual 
seven layers yielded an averaged $c_v/c_a$ in agreement 
with those determined in the previous desorptions, 
which had involved only fractions of a layer. This 
result justifies the use of the total quantity of adsorbed 
He³ divided by the total quantity of the adsorbed 
mixture as $c_a$.¹² 

A number of such determinations were made, at 
temperatures from 2.3°K to 1.6°K, and over a fairly 
wide range of film coverage. Within the experimental 
accuracy of about 10 percent, the ratio $c_v/c_a$ was 
found to be independent of the film thickness. 

The fact that $c_v/c_a$ does not depend on the film 
thickness permits a rather simple summary to be made 
of the data. This is shown in Table II, which gives the 
average value of $c_v/c_a$ for the range of film coverage 
measured at each temperature. 

The nondependence of $c_v/c_a$ on film thickness found 
in these experiments is subject to the criticism that 
in the desorption technique used, not only the film 
thickness is changed, but the concentration of He³ is 
radically changed in the desorption process. In order 
to check this point, several runs were made, especially 
at 1.8°K, starting with radically different film thicknesses, 
but with the same He³-He⁴ mixture. The 
resulting $c_v/c_a$ values were always the same, within 
the experimental error, and are included in the summary 
of Table II. 

TABLE II. Summary of He³/He⁴ distribution between 
vapor and adsorbate. 

Thus no anomalous adsorption of either He³ or He⁴ 
has been found, not even in the first statistical layer, 
which in the BET treatment is very highly compressed 
and exhibits solid-like behavior.¹³ 

In the manner in which these experiments were 
performed, superfluidity in the adsorbed layers should 
occur over a range of temperatures below 2.19°K, 
depending on the thickness of the film (see reference 
13). The ratio $c_v/c_a$ should therefore be affected by 
superflow, giving preferential evaporation of the He’. 

The independence of $c_v/c_a$ on film thickness at the 
temperatures shown in Table II indicates that these 
measurements are relatively free of this effect, due 
presumably to the thinness of the film and to the 
enormously high surface available for evaporation, as 
compared to the bulk liquid case. 

12 The thermodynamic equilibrium between vapor and film is 
determined by the differential value of $c_a$, which a priori could 
not be expected to be equal to the average value used above. 

13 See reference 6, p. 12. 

Fig. 2. Values of $c_v/c_a$ and $c_v/c_l$ (distribution of He³ between 
vapor and condensed phases) as a function of temperature. 





In order to make a more rigorous test of this conclusion, 
a small adsorption bulb, used previously¹⁴ for 
the first adsorption measurements on helium in this 
laboratory, was substituted for the adsorption bulb of 
Fig. 1. This system had only 1.7 m² adsorbing surface 
as compared to 100 m² in the measurements summarized 
in Table II. A series of measurements was made at 
1.801°K, in the range from 34 adsorbed layers (over 
99 percent saturation) to about four layers. At this 
temperature, the onset of superfluidity should occur 
at about seven layers, corresponding to ~93 percent 
saturation. 

The same technique was used as before; however, 
due to the smaller surface area, the rate of desorption 
was increased by a factor of over fifty. The resulting 
values of $c_v/c_a$, which were necessarily less accurate, 
ranged from 11 to 16, again independent of the film 
thickness and of the He³ concentration, which varied 
between 0.2 percent and 0.003 percent in the adsorbed 
phase. This must be considered as being consistent 
with the values summarized in Tables I and II. 

The superflow probably does not materially 
affect the sampling because 
even in this unfavorable case 1.7 m² of surface are 
available for establishing vapor-condensate equilibrium, 
whereas in sampling from bulk liquid the effective 
surface is only a few cm²; the film flow is therefore 
inadequate to overcome a factor of 10⁴. 

It is of interest to compare these data with the 
values of $c_v/c_l$ for vapor-liquid mixtures. The most 
recent data are those of Sommers,¹⁶ and of Daunt and 
Heer.¹⁷ 

It is unfortunately not possible to make an adequate 
comparison of the data with those of Daunt and Heer. 
Due to discrepancies in the analyses of their isotopic 
mixture, they were forced to express their results in 

14 E. Long and L. Meyer, Phys. Rev. 76, 440 (1949). 

15 The possibility of capillary condensation at the higher 
saturations cannot be ruled out in this series of measurements. 
However, no effect on the $c_v/c_a$ ratios was observed. 

16 H. S. Sommers, Jr., Phys. Rev. 88, 123 (1952). 

17 J. G. Daunt and C. V. Heer, Phys. Rev. 86, 207 (1952). 

the form of ranges of $c_v/c_l$. The data of this investigation 
are in agreement with the highest values given 
by them, at each temperature. 

Figure 2 shows a comparison of the $c_v/c_a$ data with 
the distribution ratio $c_v/c_l$ derived from the values 
for a 0.58 percent mixture given by Sommers. The 
solid curve represents Sommers' data. The vertical 
lines represent the ratios measured in this investigation; 
their length indicates the spread of individual 
results. 

INGHRAM, LONG, AND MEYER 

The agreement is considered to be quite good, 
except at the higher temperatures, where the deviations 
are beyond the experimental error. However, the 
measurements of Sommers did not extend to temperatures 
higher than 2.18°K, and the value at 2.35°K 
reported here is within the range of values reported 
by Daunt and Heer. Furthermore, the difference 
does not appear to be really significant. Interaction 
between the adsorbent and the adsorbed phase, 
responsible for the adsorption process, must be expected 
to have a slight influence on the He³ distribution 
between vapor and condensate, since the heat of 
adsorption will be different from the heat of vaporization 
of the bulk liquid. 

Some recent papers on condensation were based in part on the conjecture that the probability function 
$W(N_1)$ for the number of particles $N_1$ in a finite volume at given temperature and chemical potential $\mu$ per 
particle has two maxima for a finite range of these variables. We have investigated the validity of this 
conjecture for a finite version (square of $N$ sites with toroidal connection) of the two-dimensional lattice 
gas of Lee and Yang. Considering at first only the values $N_1 \le 3$ and $N_1 \ge N-3$, we show that $W(N_1)$ has 
at least two maxima for values of $\mu$ in the neighborhood of the transition value at temperatures smaller than 
or approximately equal to $3.5\,T_c/\ln N$, where $T_c$ is the critical temperature of the infinite model, while the 
upper points of the first three saltus of the most probable density of the finite model approximate the density 
of the infinite model in the gas region with a relative error of order $N^{-1/2}$. (An analogous result holds by 
symmetry for the liquid region.) To extend these results to a larger range of numbers $N_1$ we consider a 
histogram obtained from $W(N_1)$ by summation over relatively narrow groups of numbers in the range 
$N_1 \le n$, where $n$ is an arbitrary integer $< N^{1/2}$. We show that at sufficiently low temperatures this histogram 
has a maximum centered on $N\rho_n$, where $\rho_n$ is the $n$th partial sum of the Mayer series for the density of the 
infinite model in powers of the fugacity. The finite model thus provides a physical interpretation for the 
extrapolation of the density (by means of a partial sum of the Mayer series) beyond the transition value 
of the fugacity. 


In a previous paper¹ we have derived some consequences 
of the assumption that a real fluid can be 
considered as a "binary alloy" (rather than an ensemble²) 
of submacroscopic systems (cells) whose partition 
function is of the van der Waals type. By this we 
mean that the grand canonical probability function for 
the number of particles in a small cell has two sharp 
maxima for a certain range of values of the temperature 
and the Gibbs free energy, so that the relatively most 
probable density as a function of the Gibbs free energy 
per atom resembles the rising branches of the S-shaped 
curve which one obtains from the van der Waals 
equation.³ If submacroscopic volumes of a real fluid 
have this property, the cells can be considered as 


* Guggenheim Fellow on leave from Northwestern University. 

† Present mail address: Physics Department, Northwestern 
University, Evanston, Illinois. 

1 A. J. Siegert, Phys. Rev. 96, 243 (1954). 

2 S. Katsura and H. Fujita, Progr. Theoret. Phys. (Japan) 5, 
997 (1950). 

3 The descending branch corresponds to a probability minimum. 


objects which can assume only two states, and if this 
property persists to volumes large enough so that the 
interaction between particles in different cells can be 
replaced by an interaction between adjacent cells, 
depending only on their states, the total system becomes 
an "alloy" or Ising model of cells. We showed 
in the preceding paper¹ that under these assumptions, 
and for plausible values of the interaction energy 
between cells, condensation occurs as phase transition 
of the Ising model, while the thermodynamic functions 
of the total system are otherwise still essentially 
determined by the stable branches of the corresponding 
functions of the individual cell. 

It seems thus an important question whether the 
relatively most probable density of a system of finite 
volume is, for a certain range of the temperature and 
the Gibbs free energy, a two-valued function of the 
Gibbs free energy per atom and approaches the discontinuous 
form, expected to occur for infinite volume, 
by contraction of the region of two-valuedness, or 
METASTABLE STATES OF FINITE LATTICE GAS 


whether it approaches the discontinuous form by the 
steepening of a single-valued, monotonic function.⁴ 
The empirical fact of the existence of metastable states 
(undercooled gas and overheated liquid) suggests the 
first possibility. 

This question could become of practical importance 
if the branches of the most probable density of the 
finite system were to approach the density of the infinite 
system in the stable regions rapidly with increasing 
volume, and to approximate them for cell volumes 
capable of containing only a number of particles small 
enough to be handled by machine calculations.⁵ 

One must not expect that the existence of metastable 
states of the subsystems is a necessary or sufficient 
condition for phase transitions in general. Van Hove 
has found a model which exhibits a phase transition 
without having metastable states in its finite form,⁶ 
and it is well known that an appropriate interaction 
between subsystems is necessary for a phase transition. 

We, therefore, thought it of interest to study the 
question of metastable states for the finite version of 
the two-dimensional lattice gas of Lee and Yang,⁷ since 
the infinite, two-dimensional lattice gas is the only 
realistic model for which some exact results are known. 
Since we are interested only in the existence and some 
simple properties of the metastable states for this model, 
we have in Sec. 2 limited these considerations to less 
than four particles in the gas and less than four empty 
sites in the liquid, which limits us to temperatures 
which are well below the critical temperature, but, as 
we will show, not absurdly low. The results of this 
section have a direct bearing on our previous paper¹ in 
that they show that the conjecture made there holds 
for a finite but large range of larger cell volumes, if it 
is true for some value of the cell volume. In Sec. 3 we 
have given a generalization to a wider range of numbers 
of particles or empty cells. 


2. 


We consider a lattice gas in a "volume" consisting 
of a square of $N$ cells in a square lattice. Opposite 
edges of the square are considered as adjacent to avoid 
edge effects in the counting of configurations.⁸ The 
4 The expectation value of the density is an increasing function 
of the Gibbs free energy, and can thus approach the discontinuous 
limiting form only in this way; see C. N. Yang and T. D. Lee, 
Phys. Rev. 87, 404 (1952), Appendix III. 

5 It may be advantageous to use as the finite system a cell 
which is connected to itself by identification of opposite faces; 
see references 8 and 13. 

6 L. Van Hove (private communication). 

7 T. D. Lee and C. N. Yang, Phys. Rev. 87, 410 (1952). 

8 If an infinite lattice gas is to be considered as built up of such 
subsystems the removal of interaction between particles on 
opposite edges must be taken into account in the interaction 
between subsystems. At sufficiently low temperatures only a 
group of states with very few particles or a group of states with 
very few empty cells will have non-negligible probability. Considering 
these groups of states as the states of the subsystems 
one sees that the interaction between subsystems is essentially 
given by an interaction energy between systems in different states 
only, which is positive and of order $\sqrt{N}$ times larger in magnitude 
than the interaction energy between particles. This distinguishes 
cells are assumed to contain only zero or one particle. 
The numbers of empty and occupied cells are denoted 
by $N_0$ and $N_1$, respectively. 

The relative probability $W(N_1)$ of finding $N_1$ cells 
occupied at given Gibbs free energy per atom $kT\ln y$ is 

$$W(N_1) = y^{N_1} Z_G(N_1,N), \qquad (2.1)$$

$$Z_G(N_1,N) = {\sum_{\sigma}}' e^{2\beta\epsilon N_{11}(\sigma)}, \qquad (2.2)$$

where ${\sum_{\sigma}}'$ denotes summation over the occupation 
numbers (0 and 1) of each cell, with the restriction 
that the number of occupied cells is $N_1$, $N_{11}(\sigma)$ is the 
number of pairs of adjacent occupied cells for the 
configuration $\sigma$, $-2\epsilon$ is the interaction energy per pair 
of adjacent occupied cells, and $\beta=1/kT$. 

By means of the identity, 

$$N_{11} = 2N_1 - \tfrac{1}{2}N_{10}, \qquad (2.3)$$

where $N_{10}$ denotes the number of pairs of adjacent 
cells with different occupation numbers, $W(N_1)$ can 
be expressed in terms of the relative probability $P(N_1,z)$ 
that an Ising model taken from a canonical ensemble 
in a magnetic field $H=-kT\ln z$ has magnetization 
$I=N^{-1}(N_0-N_1)$. One obtains 

$$W(N_1) = z^{N} P(N_1,z), \qquad (2.4)$$

with 

$$P(N_1,z) = z^{2N_1-N}\,{\sum_{\sigma}}' e^{-\beta\epsilon N_{10}(\sigma)}, \qquad (2.5)$$

where⁹ 

$$\ln z = \tfrac{1}{2}(\ln y + 4\beta\epsilon) = -\beta H. \qquad (2.6)$$


The function $P(N_1,z)$ has the obvious symmetry 
property: 

$$P(N_1,z) = P(N-N_1, z^{-1}). \qquad (2.7)$$


Some special values of the function $P(N_1,z)$ which 
can be obtained by straightforward counting of the 
possible configurations with 0, 1, 2, and 3 particles or 
empty cells are the following¹⁰ (with $x=e^{-\beta\epsilon}$): 

$$P(N, z) = z^{N},$$

$$P(N-1, z) = z^{N-2} N x^{4},$$

$$P(N-2, z) = z^{N-4}\,\frac{1}{2!}\,[N(N-5)x^{8} + 4N x^{6}],$$

$$P(N-3, z) = z^{N-6}\,\frac{1}{3!}\,[N(N^{2}-15N+62)x^{12} + N(12N-96)x^{10} + 36N x^{8}]. \qquad (2.8)$$

The corresponding values for $N_1=0$, 1, 2, and 3 are 
obtained from Eq. (2.7). 
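
The counting behind these special values can be checked by brute force on a small torus. The sketch below is not part of the original derivation: it enumerates all configurations of $k$ particles on an $L \times L$ toroidal lattice, tallies them by the number $N_{10}$ of unlike adjacent pairs, and compares the tallies with the coefficients of the powers of $x$ in the bracketed polynomials. The function names and the choice $L=5$ are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def unlike_pair_counts(L, k):
    """Tally k-particle configurations on an L x L torus by N10.

    N10 is the number of adjacent cell pairs with different occupation;
    each such pair is counted once, from its occupied cell. Requires L >= 3
    so that the four neighbours of a cell are distinct.
    """
    sites = [(r, c) for r in range(L) for c in range(L)]
    counts = Counter()
    for occupied in combinations(sites, k):
        occ = set(occupied)
        n10 = 0
        for (r, c) in occ:
            neighbours = (((r + 1) % L, c), ((r - 1) % L, c),
                          (r, (c + 1) % L), (r, (c - 1) % L))
            n10 += sum(1 for nb in neighbours if nb not in occ)
        counts[n10] += 1
    return dict(counts)


def formula_counts(N, k):
    """Counts read off from the polynomials quoted in the text, keyed by N10."""
    if k == 1:
        return {4: N}
    if k == 2:
        return {8: N * (N - 5) // 2, 6: 2 * N}
    if k == 3:
        return {12: N * (N * N - 15 * N + 62) // 6,
                10: N * (12 * N - 96) // 6,
                8: 6 * N}
    raise ValueError("only k = 1, 2, 3 are tabulated")


L = 5
checks = {k: unlike_pair_counts(L, k) == formula_counts(L * L, k)
          for k in (1, 2, 3)}
```

By the particle-hole symmetry of Eq. (2.7) the same tallies apply to $k$ empty cells. The side $L=5$ is chosen so that three particles cannot form a chain around the torus.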


the two-dimensional lattice gas from the one-dimensional lattice 
gas, which also has metastable states for finite subsystems, but 
for which the interaction energy between subsystems is inde- 
pendent of the size of the subsystems. 

9 See reference 7, p. 412, footnote 4, Eq. (A). 

10 See C. N. Yang, Special Problems of Statistical Mechanics 
(Mimeographed Lecture Notes, University of Washington, 1952), 
Part II, p. 158. 
We consider the relative maxima $N_1=N_1^{(m)}(z)$ of 
$P(N_1,z)$. The function $N_1^{(m)}(z)$ is a step function of $z$, 
and we want to establish the fact that for fixed $N$ and 
sufficiently low temperature, $N_1^{(m)}(z)$ is at least two-valued 
for a region of values of $z$ near $z=1$. 

Intuitively, the existence of at least one metastable 
state is of course quite obvious: If, for a small positive 
value of $H$ at a sufficiently low temperature, the state 
$N_1=N$ (all spins pointing down) is stable, the state 
with all spins pointing up ($N_0=N$), while very improbable 
compared with $N_1=N$, is still metastable, 
i.e., more probable than the state $N_1=N-1$, as long 
as the product of the configuration probability ($N$) and 
the increase of the magnetic Boltzmann factor ($e^{2\beta H}$) 
of the state $N_1=N-1$ does not win out over the 
decrease of the Boltzmann factor for the interaction 
energy ($e^{-4\beta\epsilon}$, since 4 antiparallel pairs are created, 
in the toroidal lattice, by reversing one spin). For 
sufficiently small $H>0$, the state $N_1=N$ is thus metastable 
as long as $N<e^{4\beta\epsilon}$. Since the Curie temperature 
$T_c$ for the infinite lattice is given by $e^{\epsilon/kT_c}=\sqrt{2}+1$, 
this means metastability for 

$$T<T_1=(3.52/\ln N)\,T_c. \qquad (2.9)$$


The essential feature is the dependence on $N$ through 
$\ln N$, which shows that even for large numbers $N$ 
metastability is not restricted to absurdly low temperatures. 
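
As a numerical illustration of Eq. (2.9), the bound $T_1/T_c = 4\ln(\sqrt{2}+1)/\ln N$ can be tabulated for a few lattice sizes. The function name and the sample sizes below are illustrative choices only.

```python
import math

# Prefactor 4 ln(sqrt(2) + 1), quoted as 3.52 in Eq. (2.9).
PREFACTOR = 4.0 * math.log(math.sqrt(2.0) + 1.0)


def metastability_temperature_ratio(n_cells):
    """Upper bound T_1/T_c below which the one-hole state is metastable."""
    return PREFACTOR / math.log(n_cells)


# The bound falls only logarithmically with the number of cells.
ratios = {n: metastability_temperature_ratio(n) for n in (10**2, 10**4, 10**6)}
```

Even for a million cells the bound stays at roughly a quarter of the critical temperature, which is the sense in which the temperatures are "not absurdly low."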

To demonstrate this in more detail we note that the 
values $N^{(l)}$ of $N$ for which the metastability of the state 
$N_1=l-1$ ends and metastability of the state $l$ starts 
are (for $H=0$) obtained as roots of the equations 

$$P(l,1)=P(l-1,1), \qquad (2.10)$$

which are: 

$$Nx^{4}=1 \quad\text{for}\quad l=1,$$

$$(N-5)x^{4}+4x^{2}=2 \quad\text{for}\quad l=2,$$

$$(N^{2}-15N+62)x^{6}+12(N-8)x^{4}+36x^{2}=3[(N-5)x^{2}+4] \quad\text{for}\quad l=3. \qquad (2.11)$$

We thus have: 

$$N^{(1)}=x^{-4},$$

$$N^{(2)}=2x^{-4}[1-2x^{2}+(5/2)x^{4}],$$

$$N^{(3)}=3x^{-4}[1-(8/3)x^{2}+(26/9)x^{4}+O(x^{6})]. \qquad (2.12)$$
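
The expansions in Eq. (2.12) can be compared with the exact roots of Eq. (2.11). The sketch below solves the $l=2$ equation directly and the $l=3$ equation as a quadratic in $N$, taking the positive root; the function names and the test value $x=0.1$ are illustrative assumptions.

```python
import math


def n2_exact(x):
    """Root of (N - 5) x^4 + 4 x^2 = 2."""
    return 5.0 + (2.0 - 4.0 * x**2) / x**4


def n2_series(x):
    return 2.0 * x**-4 * (1.0 - 2.0 * x**2 + 2.5 * x**4)


def n3_exact(x):
    """Positive root of the l = 3 equation, written as a N^2 + b N + c = 0."""
    a = x**6
    b = -15.0 * x**6 + 12.0 * x**4 - 3.0 * x**2
    c = 62.0 * x**6 - 96.0 * x**4 + 51.0 * x**2 - 12.0
    return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)


def n3_series(x):
    return 3.0 * x**-4 * (1.0 - (8.0 / 3.0) * x**2 + (26.0 / 9.0) * x**4)


x = 0.1
rel_err_2 = abs(n2_exact(x) - n2_series(x)) / n2_exact(x)
rel_err_3 = abs(n3_exact(x) - n3_series(x)) / n3_exact(x)
```

The $l=2$ expansion is exact, while the $l=3$ expansion differs from the root only by the neglected $O(x^{6})$ term.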


At a fixed low temperature the lattice gas of $l$ particles 
and the lattice "liquid" with $l$ holes ($l=0$, 1, 2, 3) are 
thus metastable at the transition pressure for a lattice 
of $N$ cells, if $N<l\,e^{4\beta\epsilon}$. 

Or, if we consider a lattice of fixed size $N$, and put 
all $N^{(l)}$ in Eq. (2.12) equal to $N$, the roots $x_l$ of these 
equations determine the temperatures 

$$T_l=T_c\ln(\sqrt{2}-1)/\ln x_l, \qquad (2.13)$$


for which, at $H=0$, the state $l-1$ ceases to be metastable 
and $l$ starts to be metastable. Since the magnetization 
$I_l$ of the state $l$ is 

$$I_l=1-(2l/N), \qquad (2.14)$$

the saltus from $I_{l-1}$ to $I_l$ occurs at values of $x$ obtained 
as roots of the equation 

$$I_l=1-2x^{4}[1+O(x^{2})]. \qquad (2.15)$$


The lower point of each saltus of the step function 
representing the metastable magnetization at $H=0$ as 
function of the temperature for the finite Ising model 
agrees therefore with the known low-temperature expansion 
for the spontaneous magnetization $I$ of the 
infinite Ising model, which is known¹¹ to be 

$$I=1-2x^{4}-8x^{6}-\cdots; \qquad (2.16)$$

the relative deviation in $1-|I|$ is of order $x^{2}\sim N^{-1/2}$. 
The specific volume $v_g$ per atom of the gas is given by 

$$v_g^{-1}=l/N=\tfrac{1}{2}(1-I), \qquad (2.17)$$

that of the liquid by 

$$v_l^{-1}=1-v_g^{-1}. \qquad (2.18)$$

At the transition we thus have a step function for the 
most probable specific volume per atom as function of 
the temperature, whose saltus [from $(l-1)/N$ to $l/N$] 
are determined by the roots of the equations 

$$l/N=x^{4}[1-O(x^{2})]. \qquad (2.19)$$

For the infinite lattice gas one has 

$$v_g^{-1}=x^{4}(1+4x^{2}+\cdots). \qquad (2.20)$$
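
The series (2.16) and (2.20) can be compared with a closed form. The sketch below assumes the standard exact result for the spontaneous magnetization of the square Ising lattice, $I=[1-\sinh^{-4}(\epsilon/kT)]^{1/8}$, rewritten with $x=e^{-\beta\epsilon}$ as $I=[1-16x^{4}/(1-x^{2})^{4}]^{1/8}$. This closed form is not quoted in the text above and is used here only as an outside check; the function names are illustrative.

```python
def magnetization_series(x):
    """Low-temperature expansion, Eq. (2.16), truncated after the x^6 term."""
    return 1.0 - 2.0 * x**4 - 8.0 * x**6


def magnetization_closed_form(x):
    """Assumed exact result: [1 - sinh^-4]^(1/8) with sinh = (1 - x^2)/(2x)."""
    s = (1.0 - x * x) / (2.0 * x)
    inner = 1.0 - s ** -4
    return max(inner, 0.0) ** 0.125  # clamp rounding noise near T_c


def gas_density(x):
    """Reciprocal specific volume of the gas, Eq. (2.17), at the transition."""
    return 0.5 * (1.0 - magnetization_closed_form(x))


diff_magnetization = abs(magnetization_series(0.2) - magnetization_closed_form(0.2))
diff_density = abs(gas_density(0.1) - 0.1**4 * (1.0 + 4.0 * 0.1**2))
```

The residual differences are of the order of the first neglected term in each series.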


Next we obtain a lower bound for the range of values 
of the magnetic field (or Gibbs free energy) for which 
the relatively most probable magnetization (or density) 
is at least two-valued, i.e., for the length of the van der 
Waals type "overshoots." Because of the symmetry 
[Eq. (2.7)] we need only to consider the case of the 
liquid. 

We have 

$$N_1^{(m)}=N-l \quad\text{for}\quad z^{N-2l}P(N-l,1) > z^{N-2(l+1)}P(N-(l+1),1), \qquad (2.21)$$

or 

$$P(N-l,1) > z^{-2}P(N-(l+1),1). \qquad (2.22)$$

The state $N_1^{(m)}=N$, for instance, is stable or metastable 
for 

$$z^{2} > P(N-1,1)/P(N,1) = Nx^{4}, \qquad (2.23)$$

or for 

$$2H < -kT\ln Nx^{4} = 4\epsilon - kT\ln N \qquad (2.24)$$

in the magnetic case. If, therefore, the temperature 
satisfies the inequality (2.9), the region of values $H$ for 
which $N_1^{(m)}$ is at least two-valued, i.e., the region of 
metastability, shrinks only with $\ln N$. The width $|\Delta H|$ 


11 C. N. Yang, Phys. Rev. 85, 808 (1952). 
of this region is thus estimated as 

$$|\Delta H| > 4k\ln(\sqrt{2}+1)\{T_c - T\ln N/[4\ln(\sqrt{2}+1)]\} = 3.52k[T_c-0.28\,T\ln N]; \qquad (2.25)$$

the inequality sign holds, since other states may extend 
the region of metastability. Correspondingly the region 
of metastability of the Gibbs free energy per atom 
extends over an interval $2|\Delta H|$. 
The region of metastability of the state $N_1=N-1$ is 
given by 

$$\tfrac{1}{2}[(N-5)x^{4}+4x^{2}] < z^{2} < Nx^{4}. \qquad (2.26)$$

At temperatures appreciably below $T_1$, i.e., if 

$$x^{2} < 4(N+5)^{-1} \quad\text{or}\quad T < 1.76\,T_c/\ln[(N+5)/4], \qquad (2.27)$$


the two inequalities are in contradiction and the state 
$N-1$ is the relatively least probable state and belongs 
to a retrogressive (probability minimum) branch of the 
step function $N_1^{(m)}$. For any $l$, it follows from the 
inequalities (2.21) that the values $z_l$ where $N_1^{(m)}$ has a 
saltus from $N-l+1$ to $N-l$ are given by 

$$z_l^{2}=P(N-l,1)/P(N-l+1,1) \qquad (2.28)$$

(with $z_l \ge 0$ by definition), as long as the values $z_l$ 
decrease with increasing $l$. The extension of the step 
function to the increasing sequence represents probability 
minima. From Eqs. (2.8) we thus obtain, for 
$l=1$ to 3, and $Nx^{4}\sim 1$, 

$$z_l^{2}=l^{-1}Nx^{4}[1+O(x^{2})], \qquad (2.29)$$

and, for the specific volume per atom of the liquid, 

$$1-l/N=1-z_l^{-2}x^{4}[1+O(x^{2})]. \qquad (2.30)$$

Using Eq. (2.6), we then have 

$$1-l/N=1-y_l^{-1}x^{8}[1+O(x^{2})], \qquad (2.31)$$


where $y_l$ is the value of $y$ for which the saltus $N-l+1$ 
to $N-l$ occurs. The expansion of the reciprocal volume 
per atom in the liquid phase is known¹² to be 

$$v_l^{-1} = 1 - y^{-1}x^{8} - y^{-2}(4x^{14}-5x^{16}) - y^{-3}(18x^{20}-48x^{22}+31x^{24}) - 4y^{-4}(x^{24}+\cdots) - \cdots$$

$$= 1 - y^{-1}x^{8}\{1 + (y^{-1}x^{4})(4x^{2}-5x^{4}) + (y^{-1}x^{4})^{2}(18x^{4}-48x^{6}+31x^{8}) + (y^{-1}x^{4})^{3}(4x^{4}+\cdots) + \cdots\}. \qquad (2.32)$$

Since, in the liquid phase, $y^{-1}x^{4}<1$, the lower point of 
each saltus of the most probable density of the finite 
lattice liquid approximates the density of the infinite 
lattice liquid, with a relative error of order $x^{2}\sim N^{-1/2}$ in 
the density defect $1-v_l^{-1}$. 


3. 


We will prove in this section that, at sufficiently low 
temperature, the probability that the number $N_1$ of 
particles is in the set $S_n$ of non-negative integers $l$ 


12 See reference 7, p. 413, Eqs. (22). 
defined by 

$$|l-N\rho_n(y)| < [3Ny\,d\rho_n(y)/dy]^{1/2} \qquad (3.1)$$

and 

$$l \le n < \sqrt{N}, \qquad (3.1')$$

is at least twice as large as the probability that $N_1$ is a 
number of the set $\bar{S}_n$, defined by 

$$|l-N\rho_n(y)| \ge [3Ny\,d\rho_n/dy]^{1/2} \qquad (3.2)$$

and 

$$l \le n < \sqrt{N}. \qquad (3.2')$$


Here, $\rho_n(y)$ denotes the polynomial obtained by omitting 
all terms beyond the term with $y^{n}$ from the expansion 
of $\rho(y)$ in ascending powers of $y$, where $\rho(y)$ is the 
density of the infinite two-dimensional lattice gas. The 
number $N$ is again the "volume," i.e., the number of 
cells in the toroidally connected, finite square lattice. 

The probability that $N_1$ is a number in $S_n$ is thus a 
fortiori at least twice as large as the probability that $N_1$ 
is a number in any set $S_n'$, where $S_n'$ is defined by 
substituting $l'$ for $N\rho_n(y)$ in (3.1), provided that $S_n$ 
and $S_n'$ have no numbers in common. 

We will then show that for given finite $N$ and $n$ the 
temperature can be chosen low enough so that 

$$N\rho_n(y) < n-3[3Ny\,d\rho_n/dy]^{1/2} \qquad (3.3)$$

even for a finite range of values of the Gibbs potential 
per atom ($kT\ln y$) above the transition, so that there 
is at least one such set $S_n'$ containing larger numbers 
than $S_n$. There is thus a maximum in the histogram, 
i.e., in the step function representing the probability of 
the states grouped together into such non-overlapping 
sets; and the position of this maximum is determined 
by the partial sum $\rho_n(y)$ of the density $\rho(y)$, i.e., by a 
polynomial extrapolating $\rho(y)$ beyond the transition. 

In order to prove these statements, we first express 
the probability function for $N_1$ particles in the finite 
toroidal lattice gas, for $N_1<\sqrt{N}$, in terms of the 
pressure of the infinite lattice gas. 

The relative probability $W(l)$ for $N_1=l$ is given by 

$$W(l) = (yx^{-4})^{l}P(l,1) \qquad (3.4)$$

according to Eqs. (2.4), (2.5), and (2.6), where Eq. 
(2.5) defines $P(l,1)$. 
The grand partition function $Q_N(y)$ is defined by 

$$Q_N(y) = \sum_{l=0}^{N} W(l). \qquad (3.5)$$

If we were to expand $N^{-1}\ln Q_N(y)$ by means of the 
Ursell expansion in ascending powers of $y$, with Mayer's 
reducible cluster integrals as coefficients (these are 
sums in the case of the lattice gas), we would find that 
the first $\sqrt{N}-1$ cluster integrals are independent of $N$ 
since only for $N_1 \ge \sqrt{N}$ chains around the torus can 
occur. The expansion of $N^{-1}\ln Q_N(y)$ in ascending 



13 In a real gas, too, the first $n<(V^{1/3}/2a)$ reducible cluster 
integrals are independent of the volume $V$, if the gas is considered 
in a cubical cell of volume $V$ which is connected to itself toroidally 
by identification of the opposite surfaces, and $a$ denotes the range 
of intermolecular forces. 
powers of $y$ must, therefore, agree with the expansion of 

$$\beta p(y) = \lim_{N\to\infty}[N^{-1}\ln Q_N(y)] \qquad (3.6)$$

up to, and including, the term with $y^{\sqrt{N}-1}$. The first 
$(\sqrt{N}-1)$ coefficients $P(l,1)$ for the toroidal Ising model 
of $N$ sites can thus be expressed in terms of the pressure 
$p(y)$ of the infinite Ising model, in the form 

$$P(l,1) = x^{4l}(2\pi i)^{-1}\oint^{(0+)} y^{-l-1}\,dy\,e^{N\beta p(y)} \quad\text{for}\quad l<\sqrt{N}, \qquad (3.7)$$

where the path of integration must be chosen within 
the circle of convergence ($|y|<x^{4}$)¹⁴ of $p(y)$. This 
restriction can be immediately removed, however, since 
$p(y)$ can be replaced without error by the polynomial 
$p_n(y)$ obtained by omitting all terms beyond the term 
with $y^{n}$ in the power series for $p(y)$, if $l \le n<\sqrt{N}$. 

Expanding both sides of Eq. (3.7) in powers of $N$ we 
find that the coefficient $P^{(1)}(l,1)$ of $N$ in $P(l,1)$ is equal 
to the coefficient of $y^{l}$ in $x^{4l}\beta p(y)$. We thus obtain 

$$\beta p_n(y) = \sum_{l=1}^{n}(yx^{-4})^{l}P^{(1)}(l,1) \quad\text{for}\quad n<\sqrt{N} \qquad (3.8)$$

and, using Eqs. (2.8) and (2.7), 

$$\beta p(y) = y + \frac{(yx^{-4})^{2}}{2!}[4x^{6}-5x^{8}] + \frac{(yx^{-4})^{3}}{3!}[36x^{8}-96x^{10}+62x^{12}] + \cdots \qquad (3.9)$$

in agreement with the known expansion of $\beta p(y)$.¹⁵ The 
lowest exponent of $x$ in the coefficient of $(yx^{-4})^{l}$ is the 
least number $N_{10}^{*}(l)$ of pairs of adjacent cells with 
different occupation numbers in a gas of $l$ particles. The 
coefficient of $(yx^{-4})^{l}x^{N_{10}^{*}(l)}$ equals $N^{-1}$ times the number 
of ways of obtaining a configuration $N_{10}^{*}(l)$ and is thus 
positive. It is, therefore, possible to choose a temperature, 
i.e., a number $x_n$, small enough, so that, for all 
$x \le x_n$, all coefficients of the polynomial $p_n(y)$ are 
positive. [We note that 

$$\rho_n(y) = \beta y\,dp_n(y)/dy \qquad (3.10)$$

and $y\,d\rho_n(y)/dy$ are positive for positive $y$, and that 

$$\beta p_n(y) \le \rho_n(y) \le y\,d\rho_n/dy \quad\text{for}\quad y>0.] \qquad (3.11)$$
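
The statement that the coefficient of $N$ in $P(l,1)$ reproduces the series (3.9) can be verified exactly for $l \le 3$. The sketch below takes the logarithm of $1+a_1u+a_2u^{2}+a_3u^{3}$, with $a_l=P(l,1)$ from Eqs. (2.8) and $u=yx^{-4}$, and checks that each coefficient divided by $N$ is independent of $N$. Rational arithmetic is used so that the comparison is exact; the function names and the sample values of $N$ and $x$ are illustrative.

```python
from fractions import Fraction


def P(l, N, x):
    """Configuration sums P(l, 1) for l = 1, 2, 3 from Eqs. (2.8)."""
    if l == 1:
        return N * x**4
    if l == 2:
        return Fraction(N, 2) * ((N - 5) * x**8 + 4 * x**6)
    if l == 3:
        return Fraction(N, 6) * ((N * N - 15 * N + 62) * x**12
                                 + (12 * N - 96) * x**10 + 36 * x**8)
    raise ValueError("only l = 1, 2, 3 are tabulated")


def log_coefficients_per_cell(N, x):
    """Coefficients of u, u^2, u^3 in ln(1 + a1 u + a2 u^2 + a3 u^3), over N."""
    a1, a2, a3 = (P(l, N, x) for l in (1, 2, 3))
    b1 = a1
    b2 = a2 - a1**2 / 2
    b3 = a3 - a1 * a2 + a1**3 / 3
    return (b1 / N, b2 / N, b3 / N)


def series_coefficients(x):
    """Coefficients of (y x^-4)^l in Eq. (3.9)."""
    return (x**4,
            Fraction(1, 2) * (4 * x**6 - 5 * x**8),
            Fraction(1, 6) * (36 * x**8 - 96 * x**10 + 62 * x**12))


x = Fraction(1, 3)
agree = all(log_coefficients_per_cell(N, x) == series_coefficients(x)
            for N in (16, 25, 100))
```

The terms of order $N^{2}$ and $N^{3}$ cancel identically, which is the finite-lattice form of the usual cluster-expansion statement that the logarithm of the grand partition function is extensive.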


For $x \le x_n$, the terms in the power series for 
$\exp[N\beta p_n(y)]$, expanded in ascending powers of $y$, are 
all positive, for real $y>0$, and may therefore be considered 
as relative probabilities. We define an extended 
probability function $\tilde{W}(l)$ for all non-negative integers 
$l$ by 

$$\tilde{W}(l) = \exp[-N\beta p_n(y)]\,y^{l}\,\frac{1}{2\pi i}\oint \eta^{-l-1}\exp[N\beta p_n(\eta)]\,d\eta. \qquad (3.12)$$


14 See reference 7, p. 413. 

15 See reference 7, p. 412. 
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For all lg n<-+/N, we have, specially, 
W(D)=exp[—NBpa(y) WD. (3.13) 

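To the extended probability function we apply the
Bienaymé-Tchebycheff¹⁶ inequality:

$$\mathrm{Prob}\{|l-\bar l| \ge K\sigma\} \le K^{-2}, \tag{3.14}$$

where

$$\bar l = \sum_l l\,\bar W(l) = N\rho_n(y), \tag{3.15}$$

$$\sigma^2 = \sum_l (l-\bar l)^2\,\bar W(l) = N y\, d\rho_n(y)/dy, \tag{3.16}$$

and $\sigma>0$, $K>0$. This implies, specially (with $K=\sqrt{3}$),

$$\mathrm{Prob}\{|l-\bar l| < \sigma\sqrt{3}\} \ge 2\,\mathrm{Prob}\{|l-\bar l| \ge \sigma\sqrt{3}\} \tag{3.17}$$

or

$$\sum_{l\subset S} \bar W(l) \ge 2 \sum_{l\subset \bar S} \bar W(l), \tag{3.18}$$

where $S$ and $\bar S$ denote the sets of all non-negative
integers satisfying $|l-\bar l| < \sigma\sqrt{3}$ and $|l-\bar l| \ge \sigma\sqrt{3}$,
respectively. The symbol $\subset$ is the symbol for "contained
in," as used in set theory.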
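The moment relations (3.15) and (3.16) and the two-to-one bound (3.17) can be illustrated numerically. The sketch below is an assumption-laden example, not the author's procedure: it takes $n=2$, so that $\beta p_2(y) = y + b_2 y^2$ with $b_2 = 2x^{-2} - 5/2$ read off from Eq. (3.9), and uses the fact that the normalized coefficients of $\exp[N\beta p_2(y)]$ are those of a sum of two independent Poisson variables (single particles and pairs). The parameter values are arbitrary.

```python
from math import exp, lgamma, log, sqrt

def poisson_pmf(k, mu):
    """Poisson probability, evaluated in log space for numerical stability."""
    return exp(-mu + k * log(mu) - lgamma(k + 1))

def extended_distribution(N, x, u, lmax=600):
    """W-bar(l), l = 0..lmax, for beta*p_2(y) = y + b2*y**2 with y = u*x**4."""
    y = u * x**4
    b2 = 2.0 / x**2 - 2.5
    mu1, mu2 = N * y, N * b2 * y * y   # Poisson means: singles and pairs
    W = [0.0] * (lmax + 1)
    for m2 in range(lmax // 2 + 1):
        w2 = poisson_pmf(m2, mu2)
        for m1 in range(lmax - 2 * m2 + 1):
            W[m1 + 2 * m2] += w2 * poisson_pmf(m1, mu1)
    return W, y, b2

def check(N=400, x=0.5, u=1.0):
    """Return (mean, N*rho_n, variance, N*y*d(rho_n)/dy, inside, outside)."""
    W, y, b2 = extended_distribution(N, x, u)
    mean = sum(l * w for l, w in enumerate(W))
    var = sum((l - mean) ** 2 * w for l, w in enumerate(W))
    rho = y + 2 * b2 * y * y     # rho_n = y d(beta p_n)/dy
    ydrho = y + 4 * b2 * y * y   # y d(rho_n)/dy
    width = sqrt(3.0 * var)      # sigma * sqrt(3)
    inside = sum(w for l, w in enumerate(W) if abs(l - mean) < width)
    outside = sum(w for l, w in enumerate(W) if abs(l - mean) >= width)
    return mean, N * rho, var, N * ydrho, inside, outside

if __name__ == "__main__":
    print(check())
```

For these parameters the probability inside the interval exceeds the Chebyshev guarantee by a wide margin; the inequality is a worst-case bound.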


If

$$\bar l \le n - \sigma\sqrt{3}, \tag{3.19}$$

the inequality (3.18) becomes an inequality for the
physically meaningful probabilities $W(l)$:

$$\sum_{l\subset S_n} W(l) \ge 2 \sum_{l\subset \bar S_n} W(l), \tag{3.20}$$

where $S_n$ and $\bar S_n$ are obtained by omitting all numbers
$l>n$ from $S$ and $\bar S$, since the omission does not affect
the left-hand side of (3.18) [because of (3.19)], and
diminishes the right-hand side of (3.18), and since
$\bar W(l)$ and $W(l)$ are related by Eq. (3.13).

To complete the proof of the conclusions stated after
(3.3), we will actually show that there is a range of
temperatures sufficiently low so that the stronger
inequality

$$\bar l \le n - 3\sigma\sqrt{3} \tag{3.19'}$$

holds. The inequalities (3.19') and (3.3) are equivalent,
according to Eqs. (3.15) and (3.16). The inequality
(3.20) remains a fortiori correct if (3.19') is chosen
instead of (3.19).


16 H. Cramér, Mathematical Methods of Statistics (Princeton
University Press, Princeton, 1951), pp. 182, 183.

We define the variable $u$ by

$$u = y x^{-4}, \tag{3.21}$$

and consider $\rho_n(u x^4)$ as a polynomial in $u$:

$$\rho_n(u x^4) = \sum_{l=1}^{n} u^l q_l(x), \tag{3.22}$$

where $q_l(x)$ is a polynomial of degree $4l$ in $x$. The lowest
exponent of $x$ occurring in $q_l(x)$ is $N_{10}^*(l)$ [defined
after Eq. (3.9)], and $N_{10}^*(l)$ is nondecreasing for
increasing $l$. We know, from Eq. (3.9), that

$$q_1(x) = x^4, \qquad q_2(x) = 4x^6 - 5x^8; \tag{3.23}$$

the lowest exponent of $x$ occurring in any of the
polynomials $x^{-4} q_l(x)$, for $l \ge 2$, is thus $\ge 2$, and we have,
for any fixed value of $u$,


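$$\lim_{x\to 0} \rho_n^{-1}\, y\,\partial\rho_n/\partial y
= \lim_{x\to 0} \left[u x^4 + \sum_{l=2}^{n} l\,u^l q_l(x)\right] \Big/ \left[u x^4 + \sum_{l=2}^{n} u^l q_l(x)\right]
= \lim_{x\to 0} \left[1 + \sum_{l=2}^{n} l\,u^{l-1} q_l(x)\, x^{-4}\right] \Big/ \left[1 + \sum_{l=2}^{n} u^{l-1} q_l(x)\, x^{-4}\right] = 1. \tag{3.24}$$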
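Since $y\,\partial\rho_n/\partial y \ge \rho_n \ge 0$ for $y>0$, there is a number
$x'(n,u,r)>0$ such that, for all positive $x \le x'(n,u,r)$,

$$y\,\partial\rho_n/\partial y \le r\rho_n, \tag{3.25}$$

if $r$ is an arbitrarily chosen number larger than unity.
The choice of $r$ limits the width of the class-interval in
the histogram, since

$$\sigma = [N y\, d\rho_n/dy]^{1/2} \le (r\bar l)^{1/2}. \tag{3.26}$$

Since $\rho_n \to 0$ for $x \to 0$ at fixed $u$, there is also a number
$x''(n,u,r,N)>0$ such that

$$\rho_n \le N^{-1}\left[\left(n + \tfrac{27}{4} r\right)^{1/2} - \left(\tfrac{27}{4} r\right)^{1/2}\right]^2 \tag{3.27}$$

for all positive $x \le x''(n,u,r,N)$. Both the inequalities
(3.25) and (3.27) are valid if

$$x \le \min\{x_p,\; x'(n,u,r),\; x''(n,u,r,N)\}, \tag{3.28}$$

where $x_p$ was introduced earlier to insure positive
coefficients of $p_n(y)$. From (3.25) and (3.27) we obtain

$$\bar l + \sigma\sqrt{27} = N\rho_n + (27 N y\, d\rho_n/dy)^{1/2} \le N\rho_n + (27 r N \rho_n)^{1/2}
\le \left[\left(n + \tfrac{27}{4} r\right)^{1/2} - \left(\tfrac{27}{4} r\right)^{1/2}\right]^2
+ (27 r)^{1/2}\left[\left(n + \tfrac{27}{4} r\right)^{1/2} - \left(\tfrac{27}{4} r\right)^{1/2}\right] = n, \tag{3.29}$$

which reduces to the inequalities (3.19'), or (3.3). This
completes the proof of the conclusions stated after the
inequality (3.3).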
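The algebra behind the last step can be verified numerically: when $\rho_n$ sits exactly at the bound (3.27), the quantity $N\rho_n + (27rN\rho_n)^{1/2}$ equals $n$, and for any smaller $\rho_n$ it is below $n$. The short sketch below only illustrates this identity; the function names and sample values are hypothetical.

```python
from math import sqrt

def rho_bound(n, r, N):
    """Right-hand side of Eq. (3.27)."""
    b = sqrt(27.0 * r / 4.0)
    return (sqrt(n + b * b) - b) ** 2 / N

def upper_edge(rho, r, N):
    """N*rho_n + (27 r N rho_n)^(1/2), the bound on l-bar + sigma*sqrt(27)."""
    return N * rho + sqrt(27.0 * r * N * rho)

if __name__ == "__main__":
    n, r, N = 20, 1.5, 10000   # sample values with n < sqrt(N), r > 1
    print(upper_edge(rho_bound(n, r, N), r, N))
```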

We note that for fixed $n$, $r$, and $u$ and sufficiently
large $N$ the inequality $x \le x''(n,u,r,N)$ becomes the
strongest restriction.


4. RESULTS AND DISCUSSION 


We have demonstrated the existence of metastable
states (or groups of states) for a finite version of the
lattice gas of Lee and Yang.⁷ We consider a lattice gas
in a "volume" consisting of a square (toroidally
connected) of $N$ sites of a square lattice. The interaction
energy for each pair of occupied adjacent sites is
denoted by $-2\epsilon$; the critical temperature $T_c$ of the
infinite model is then given by $e^{-\epsilon/kT_c} = \sqrt{2}-1$, where
$k$ is the Boltzmann constant. We denote by $N_1$ the
number of occupied sites in the finite model and by $y$
the fugacity; the Gibbs free energy per particle is
$g = kT \ln y$, where $T$ is the temperature. Condensation
occurs in the infinite model⁷ for $y = x^4$ or $g = -4\epsilon$.

The (relatively or absolutely) most probable value
$N_1^{(0)}$ of $N_1$ is considered as a function of $y$ (or $g$) and $T$.
In Sec. 2 we have shown that the saltus of the step
function $N_1^{(0)}$ from $N-l+1$ to $N-l$ occurs at $y=y_l$
or $g=g_l$, where, for small $x$ with $Nx^4 \sim 1$,

$$y_l = N^{1/l} x^8 [1+O(x^2)] \tag{4.1}$$

and

$$g_l = -4\epsilon - 3.52k\{T_c - 0.28\,T \ln[N^{1/l}(1+O(x^2))]\} \tag{4.2}$$

for $l=1, 2, 3$. Because of the invariance of the model
under exchange of empty and occupied sites with
simultaneous substitution of $y^{-1}x^{4}$ for $yx^{-4}$, the saltus
from $l-1$ to $l$ occurs at $y=y_l'$ or $g=g_l'$, where, for
small $x$ with $Nx^4 \sim 1$,

$$y_l' = \{N^{1/l}[1+O(x^2)]\}^{-1}, \tag{4.3}$$

$$g_l' = -4\epsilon + 3.52k\{T_c - 0.28\,T \ln[N^{1/l}(1+O(x^2))]\} \tag{4.4}$$

for $l=1, 2, 3$.
For $l=1$ these values are exact without the correction
term $O(x^2)$, and we have

$$N_1^{(0)} = N \quad \text{for } g > kT \ln(Nx^8) = -4\epsilon + kT \ln(Nx^4), \tag{4.5}$$

$$N_1^{(0)} = 0 \quad \text{for } g < -kT \ln N = -4\epsilon - kT \ln(Nx^4). \tag{4.6}$$

If $N < x^{-4}$ or $T < 3.52\,T_c/\ln N$, these two regions overlap
and $N_1^{(0)}$ is at least two-valued in an interval
$(-4\epsilon - \Delta g/2,\; -4\epsilon + \Delta g/2)$, where

$$\Delta g \ge -2kT \ln(Nx^4) = 7.04k[T_c - 0.28\,T \ln N]. \tag{4.7}$$


The inequality sign holds, since the region of two- 
valuedness may be extended by other occupation 
numbers. 
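The numerical constants in Eqs. (4.2), (4.4), and (4.7) follow from the critical condition $e^{-\epsilon/kT_c} = \sqrt{2}-1$, which gives $4\epsilon/kT_c = 4\ln(1+\sqrt{2}) \approx 3.525$ (quoted as 3.52 in the text) and its reciprocal, about 0.28. The sketch below, with hypothetical function names and temperatures measured in units of $T_c$, evaluates the lower bound on $\Delta g$ and the temperature below which the two regions overlap.

```python
from math import log, sqrt

# 4*eps/(k*T_c), from exp(-eps/(k*T_c)) = sqrt(2) - 1
FOUR_EPS_OVER_KTC = -4.0 * log(sqrt(2.0) - 1.0)

def delta_g_lower_bound(T_over_Tc, N):
    """Lower bound of Eq. (4.7) in units of k*T_c:
    -2 T ln(N x^4) = 2*(4 eps / k) - 2 T ln N."""
    return 2.0 * FOUR_EPS_OVER_KTC - 2.0 * T_over_Tc * log(N)

def overlap_temperature(N):
    """Largest T/T_c for which N x^4 < 1, i.e. T < (4 eps / k) / ln N."""
    return FOUR_EPS_OVER_KTC / log(N)

if __name__ == "__main__":
    print(FOUR_EPS_OVER_KTC, 1.0 / FOUR_EPS_OVER_KTC)
    print(overlap_temperature(100), delta_g_lower_bound(0.5, 100))
```

Because the overlap temperature falls only as $1/\ln N$, even very large lattices retain a two-valued region at moderately low temperatures.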

We note especially that the region of metastability
can shrink only slowly with increasing $N$, since its lower
bound depends on $\ln N$. On the other hand, we have
shown that the upper points of the first three saltus of 
the most probable density of the gas approach the gas 
density of the infinite model with a relative error of 
order V-} in the stable region. (An analogous result 
holds for the density defect of the liquid.) Considered 
as an approximation to the case of the infinite lattice
gas, the finite model thus yields quantitatively good
results in the stable regions, but is qualitatively poor in 
yielding a multivalued function for the density. If the 
infinite lattice gas is considered as built up of finite 
models with interaction, the interaction must therefore 
cause the density of the infinite model to change 
discontinuously from one of these branches of the most 
probable density of the finite model to the other, yet 
cause only small corrections to the values of the density 
in the stable regions. It is clear from reference 8 and 
from the considerations of our previous paper,¹ that
this is not a paradox. Since we were interested primarily 
in the existence of metastable branches, we have not 
proven that there are only the two maxima treated here. 

Since the problem of the infinite lattice gas with
arbitrary Gibbs free energy (i.e., the Ising model with
magnetic field) has not yet been solved in closed form,
and only a few terms of the series expansions for the
thermodynamic functions are known, we could not
extend the preceding results in a straightforward
manner to larger values of $N_1$ or $N-N_1$. Rather than
attempting to show maxima in the probability function
for $N_1$, we have shown that there is, at sufficiently low
temperatures, a relatively narrow group of numbers $N_1$
more probable than similar neighboring groups, even
at values of the Gibbs free energy exceeding the
condensation value. The center of this group is obtained as $N$
times a partial sum of the series for the density of the
infinite model in powers of the fugacity.

We define the polynomial $\rho_n(y)$ as the partial sum of
the first $n$ terms of the series for the density $\rho(y)$ of
the infinite lattice gas expanded in ascending powers
of $y$ (not counting the constant term which vanishes)
where $n$ is a positive integer smaller than $N^{1/2}$ and
otherwise arbitrary. We prove that there is a finite
range of temperatures sufficiently low so that $\rho_n$ has
positive coefficients. We define sets $S_n$ and $\bar S_n$ of
non-negative integers $l$ by

$$|l - N\rho_n| < \sigma\sqrt{3}, \quad l \le n, \tag{4.8}$$

and

$$|l - N\rho_n| \ge \sigma\sqrt{3}, \quad l \le n, \tag{4.9}$$

respectively, where $\sigma$ is defined by

$$\sigma^2 = N y\, d\rho_n/dy, \quad \sigma > 0. \tag{4.10}$$

We then prove that there is for any fixed positive
value of $u = yx^{-4}$, and for any number $r>1$, a finite
range of temperatures sufficiently low so that

$$\sigma \le (rN\rho_n)^{1/2} \tag{4.11}$$

and

$$N\rho_n \le n - 3\sigma\sqrt{3}, \tag{4.12}$$

and that the probability that $N_1$ is one of the numbers
in $S_n$ is at least twice as large as the probability that $N_1$
is in $\bar S_n$, and thus a fortiori twice as large as the
probability that $N_1$ is a number in any set $S_n'$ defined by
replacing $N\rho_n$ by any number $l'$ with $|l' - N\rho_n| \ge 2\sigma\sqrt{3}$.
There is at least one such set $S_n'$ containing larger
numbers than $S_n$.

While we make no statements about the probability
function for the occupation numbers individually, we
have thus shown the existence of a maximum in a
histogram with a relatively narrow class interval
[determined by (4.11)]. The proof is not limited to
values of $g$ below the condensation point, since

$$g = -4\epsilon + kT \ln u \tag{4.13}$$

and $u$ could be any finite positive number. This is not
in contradiction¹⁷ with the statement that the
probability function for $N_1/N$ has, in the limit $N \to \infty$ at a
fixed nonvanishing temperature below $T_c$, a single flat
maximum if $g = -4\epsilon$, since the inequality (4.12) requires
that the temperature approaches zero if $N \to \infty$, and
the inequality (4.11) can be satisfied for temperatures
different from zero only for finite values of $u$.

In the stable region, the power series for $\rho(y)$ is
known to be convergent, and is, as far as it is known,
very rapidly convergent at low temperature, so that
$\rho_n(y)$, which determines the center of the most probable
group of states, is a very good approximation to $\rho(y)$ in
this region. The shrinking of the metastable region $\Delta g$,
however, can be seen to depend, for sufficiently large
$N$, on $\ln N$.

The author would like to thank Professor J. R. 
Oppenheimer for the hospitality of the Institute for 
Advanced Study, The John Simon Guggenheim 
Memorial Foundation for a Fellowship, and North- 
western University for a leave for research. 

17 S. Katsura, J. Chem. Phys. 22, 1277 (1954) mentions some
results of himself and C. Katsura, Busseiron Kenkyu No. 70
(1954) concerning finite Ising models, and states that "the
sharpness of the maxima becomes more remarkable as the number
of sites increases." We have not yet seen this paper, but would
doubt this statement, since it would lead to a paradox, as S.
Katsura, too, recognizes, unless, e.g., the temperature range is
made to decrease to zero with increasing $N$.

General differential equations are derived for the time history of a thermodynamic system undergoing
irreversible transformations. This is done by using Onsager's principle, and introducing generalized concepts
of free energy and thermodynamic potentials. From these equations it is shown that the instantaneous
evolution of the system satisfies a principle of minimum rate of entropy production. It is also shown how
Prigogine's theorem for the stationary state fits into the present theory. Another variational principle is
established for the case where certain variables are ignored in analogy with the methods of virtual work
in mechanics. This principle, which applies to complex physical-chemical systems, is developed more
specifically for viscoelastic phenomena, and as an example the differential equations for the deflection of a
viscoelastic plate are derived.





1. INTRODUCTION 


It has long been known that a physical system

undergoing transformation has a tendency to move 
in a direction of increasing entropy. This is usually 
expressed from a statistical viewpoint by stating that 
the evolution is toward a more probable state or more 
disorder. This principle is formulated mathematically 
in classical thermodynamics by the property that the 
Helmholtz thermodynamic potential is a minimum at 
equilibrium. This field of thermodynamics which deals 
with equilibrium problems could more justifiably be 
called thermostatics. 

There has recently been growing a new body of 
knowledge which deals essentially with nonequilibrium 
or irreversible phenomena and which more properly 
deserves the appellation of thermodynamics. Great 
impetus was given to this development from a unified 
standpoint by Onsager’s theorem which is essentially a 
reciprocity law of coupled irreversible phenomena. The 
question of the existence and formulation of variational 
principles dealing with such irreversible phenomena is 
the object of the present paper. It will be shown, for 
instance, that it is quite a general property that a system 
tends toward the most disordered state but that this 
occurs with a minimum rate of production of this 
disorder or entropy. 

A first step in this direction was made by Prigogine¹
who formulated a theorem of minimum production of 
entropy for a thermodynamic system which is in a 
stationary state, i.e., in a steady state of flow. Such a 
system for instance is one which is traversed by a steady 
flow of heat. We are concerned here with principles 
which are of a more general nature and which do not 
require steady flow. 

Section 2 develops the basic differential equations for 
irreversible phenomena by the application of Onsager’s 
principle. A quite general formulation is obtained for a 
perturbed system by the artifice of adjoining to the 
system considered a large heat reservoir at constant 


* Consultant. 
1 See S. R. De Groot, Thermodynamics of Irreversible Processes
(Interscience Publishers, Inc., New York, 1952). 


temperature. The entropy of the total system gives a 
generalization of the concept of thermodynamic poten- 
tial for the case of nonuniform temperatures. Several 
solutions of the basic equations are presented in Sec. 3 
based on results obtained by the writer in a previous 
publication.² Equations for a perturbed system were
also derived by statistical methods by Onsager and
Machlup.³

A principle of minimum production of entropy is 
established in Sec. 4. It deals with the instantaneous 
direction of evolution of the systems under any non- 
equilibrium conditions. Section 5 deals with relaxation 
modes and leads to a new viewpoint in formulating the 
variational principles for stationary flow. 

The case of a system for which certain coordinates 
are hidden is taken up in Sec. 6. The variational prin- 
ciple developed in this connection constitutes a powerful 
tool for the calculation of a wide variety of phenomena, 
involving, e.g., chemical reactions and heat transfer 
in complicated systems. It is also of particular usefulness 
in viscoelasticity. How this is done in general is shown 
by introducing the operational tensor for the stress-strain
relations.² As an example in Sec. 7 it is applied
to the derivation of the integro-differential equations 
for the deflection of a viscoelastic plate. 


2. BASIC THERMODYNAMIC RELATIONS 


We consider a system I defined by thermodynamic 
state variables. These state variables are taken here to 
be of quite general nature and may represent such varied 
physical quantities as a strain tensor, electric charges, 
local temperatures, concentrations, etc. The entropy of 
such a system is defined by subdividing it into cells and 
summing the entropy for each of these cells. This 
assumes, of course, that each cell is in a state of quasi 
equilibrium so that its entropy may be defined as if it 
were in equilibrium. The legitimacy of this definition 
was investigated by Prigogine.¹ It could also be
computed directly, of course, by means of Boltzmann's
relation expressing the entropy directly in terms of 


2 M. A. Biot, J. Appl. Phys. 25, 1385 (1954).
3 L. Onsager and S. Machlup, Phys. Rev. 91, 1505 (1953).

certain statistical or disorder parameters as done in
problems of second-order transitions.

The system is characterized by $n$ variables $q_i$ which
are defined as the departure from a certain reference
state taken as origin and for which $q_i=0$. Only small
departures from the reference state are considered and
it is assumed that in this range of variation the system
remains linear. This will generally be true if the system
is in the vicinity of an equilibrium state.

In order to apply the principles of irreversible ther- 
modynamics we must consider an isolated system. We 
therefore adjoin to system I a system II which is a 
large reservoir at constant temperature T. The total 
system I+II is now assumed to be isolated and its 
entropy is expressed as the sum of the entropies of 
each system: 

$$S = S_I + S_{II}. \tag{2.1}$$

Let us now find an expression for the entropy $S$. We
consider the heat $dh$ absorbed by System I from the
reservoir II. Conservation of energy requires:

$$dh = dU_I - \sum_i Q_i\, dq_i, \tag{2.2}$$


where $U_I$ is the internal energy of System I and $Q_i$ is a
generalized "external force" conjugate to the state
variable $q_i$. This equation may be considered to define
the external force as a perturbation acting upon the
system in a very general sense. It can be for instance a
stress or an electromotive force or can be proportional
to a chemical affinity as defined by De Donder. The
external forces may be considered part of the isolated
system by adding corresponding large energy reservoirs.
The increment of entropy acquired by the reservoir II
is therefore:

$$dS_{II} = -\frac{dh}{T} = -\frac{dU_I}{T} + \sum_i \frac{Q_i}{T}\, dq_i, \tag{2.3}$$

and the increment of entropy of the total system will be
$dS = dS_I + dS_{II}$,

$$dS = dS_I - \frac{dU_I}{T} + \sum_i \frac{Q_i}{T}\, dq_i. \tag{2.4}$$


We now define the reference state or zero state for
which all coordinates $q_i=0$ as that for which all external
forces $Q_i$ are zero and in which the system is in
equilibrium at uniform temperature $T$. The entropy $S'$ of
the system I+II when $Q_i=0$, derived from (2.4), is
given by

$$TS' = TS_I - U_I = -\tfrac{1}{2} \sum_{ij} a_{ij} q_i q_j. \tag{2.5}$$

Since we are dealing with an equilibrium state, the
entropy $S'$ is a maximum and the quadratic form

$$V = \tfrac{1}{2} \sum_{ij} a_{ij} q_i q_j \tag{2.6}$$

is positive-definite.


BIOT 


From (2.4) we also derive that under the forces $Q_i$,
the entropy $S$ of the system I+II is given by

$$TS = -V + \sum_i Q_i q_i. \tag{2.7}$$

The factor $T$, which is the constant temperature of the
reservoir II, is introduced as a factor for convenience.

If the system is displaced from the zero level by
applying the external forces very slowly and reversibly,
the system follows a succession of equilibrium states
given by the condition that the entropy is a maximum,
i.e., by the $n$ equations:

$$T\,\partial S/\partial q_i = -\partial V/\partial q_i + Q_i = 0. \tag{2.8}$$


We now consider irreversible processes for which the
partial derivatives of the entropy do not vanish.
Onsager's principle¹ may be applied to this case. It may
be stated in the following form, which is formally
different from the usual one but may be seen to be
equivalent:

$$T\,\partial S/\partial q_i = \sum_j b_{ij} \dot q_j; \tag{2.9}$$

namely, the derivatives of the entropy are linear
functions of the time rates of change $\dot q_j$ of the state variables
and the matrix of coefficients is symmetric,

$$b_{ij} = b_{ji}. \tag{2.10}$$

It should be noted in applying Onsager's relations
(2.10) to arbitrary perturbations that because of
linearity the principle of superposition is valid and that the
system responds as a succession of relaxations under
successive applications of constant force increments.

We introduce the quadratic form

$$D = \tfrac{1}{2} \sum_{ij} b_{ij} \dot q_i \dot q_j. \tag{2.11}$$

From (2.9) we derive

$$D = \tfrac{1}{2} T \sum_i \dot q_i \frac{\partial S}{\partial q_i} = \tfrac{1}{2} T \frac{\partial S}{\partial t}. \tag{2.12}$$

The quadratic form $D$ is positive-definite since it is
proportional to the time rate of production of entropy.

From Eqs. (2.7) and (2.9), we derive the basic
relations of irreversible processes:

$$\sum_j a_{ij} q_j + \sum_j b_{ij} \dot q_j = Q_i. \tag{2.13}$$

By using the quadratic expressions $V$ and $D$, they may
be written in the Lagrangian form:

$$\partial V/\partial q_i + \partial D/\partial \dot q_i = Q_i. \tag{2.14}$$

The invariant $V$ plays the role of a potential energy
and $D$ that of a dissipation function.

It is interesting to note the thermodynamic
significance of $V$ and $S$. From (2.4), we have

$$T\,dS = T\,dS_I - dU_I + \sum_i Q_i\, dq_i. \tag{2.15}$$

Now, suppose that the only external force acting on the
system is a constant pressure $P$. The conjugate variable
is the volume $-v$, and we may write

$$\sum_i Q_i\, dq_i = -P\, dv. \tag{2.16}$$

Integrating (2.15),

$$-TS = U_I - TS_I + Pv. \tag{2.17}$$

If the temperature is uniform throughout system I,
this expression represents its Gibbs thermodynamic
potential so that $-TS$ may be considered as the
extension of the concept of thermodynamic potential for
the case of nonuniform temperature and any kind of
external force $Q_i$. Similarly, the expression

$$U_I - TS_I = V \tag{2.18}$$

may be considered an extension of Helmholtz's free
energy concept.

Equations (2.13) and (2.14) for a system in the
vicinity of equilibrium apply to a large class of
phenomena. They may involve, e.g., mechanical dissipation
and elastic forces, heat transfer, chemical reactions,
electric currents and charges, as well as the coupling
between these phenomena. It may be shown that an
excess temperature applied to a boundary is an external
force with the entropy flow as the conjugate coordinate.
In problems which are open to treatment by either
classical or quantum statistics, the expression for $V$
may be obtained directly from the partition function.
Equations (2.13) and (2.14) may also be represented
by a network of springs and dashpots or an RC
network. Such a network constitutes therefore an analog
computer for the large class of phenomena included in
the present theory.†


3. SOLUTION OF THE BASIC EQUATIONS AS 
RELAXATION MODES OR STATIONARY 
FLOW 


Consider a system to which constant forces $Q_i$ are
suddenly applied. The system will obviously tend
toward some sort of new equilibrium state. This
equilibrium state will either be one of static equilibrium
where all coordinates are constant, or one in which
there is steady flow, i.e., in which all coordinates vary
proportionally with time. Proof of this follows from
expression (4.10) in reference 2 which gives the general
solution of Eqs. (2.13) in the operational form:

$$q_i = \sum_{j=1}^{n} \left[\sum_s \frac{C_{ij}^{(s)}}{p+\lambda_s} + C_{ij}\right] Q_j, \tag{3.1}$$

where $p=d/dt$ and $-\lambda_s$ are distinct roots of the
determinant:

$$\det|a_{ij} + p\,b_{ij}| = 0 \tag{3.2}$$

with $p$ as unknown. We have shown² that the values of
$\lambda_s$ are never negative and that the solution is completely
general and is not restricted by any singularity of the
matrices or multiplicity of the roots.

† The possibility of extending the electric analog to phenomena
involving coupling between heat transfer and mechanical energy
was pointed out by C. F. Kayan, "Electrical analogger
application to the heat pump process," Heating Piping and Air
Conditioning, July, 1953.

If the forces $Q_i$ are constant and are suddenly applied
at the instant $t=0$, they may be represented in terms
of the unit step function $1(t)$ as

$$Q_i = Q_i^*\, 1(t), \tag{3.3}$$

with constants $Q_i^*$. Substituting in (3.1), we make use
of the operational relation

$$\frac{1}{p+\lambda_s}\, 1(t) = \frac{1}{\lambda_s}\left(1 - e^{-\lambda_s t}\right). \tag{3.4}$$

We first assume that none of the roots $\lambda_s$ are zero. Hence
(3.1) may be written

$$q_i = \sum_{j=1}^{n} \left[\sum_s \frac{C_{ij}^{(s)}}{\lambda_s} + C_{ij}\right] Q_j^* - \sum_{j=1}^{n} \sum_s \frac{C_{ij}^{(s)} Q_j^*}{\lambda_s}\, e^{-\lambda_s t}. \tag{3.5}$$

We omit the factor $1(t)$ in (3.4) and (3.5). If none of
the roots are zero, we see that the system tends toward a set
of constant equilibrium values for the coordinates $q_i$.
The variable part of the motion may be resolved into
a sum of columns, each of which is characterized by a
certain exponential decay and which we may call modes
of relaxation. However, if some of the roots $\lambda_s$ are zero,
then there is a term of the type $C_{ij}'/p$ in expression
(3.1) corresponding to the operational relation

$$\frac{1}{p}\, 1(t) = t, \tag{3.6}$$

which yields in expression (3.5) an additional term of
the type

$$q_i = t \sum_{j=1}^{n} C_{ij}' Q_j^*. \tag{3.7}$$

This corresponds to a stationary flow. We have thus
established that the system tends toward a fixed
deviation or a steady flow. With

$$\dot q_i^* = \sum_{j=1}^{n} C_{ij}' Q_j^*, \tag{3.8}$$

we may also write

$$q_i = \dot q_i^*\, t \tag{3.9}$$

and $\dot q_i^*$ represents the stationary state velocities.
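The two kinds of behavior, a decaying relaxation mode and a stationary flow, can be seen together in a two-variable example. The sketch below is illustrative only: it assumes the basic relations $\sum_j a_{ij} q_j + \sum_j b_{ij}\dot q_j = Q_i$ with the hypothetical choice $b_{ij} = \delta_{ij}$ and a singular matrix $a_{ij}$, so that the determinant equation has the roots $\lambda = 2$ and $\lambda = 0$. A Runge-Kutta integration from rest is compared with the closed form.

```python
from math import exp

A = [[1.0, -1.0], [-1.0, 1.0]]   # a_ij: symmetric, singular (one zero root)
Q = [3.0, 1.0]                   # constant forces applied at t = 0

def qdot(q):
    """Basic relations with b_ij = delta_ij: qdot_i = Q_i - sum_j a_ij q_j."""
    return [Q[i] - sum(A[i][j] * q[j] for j in range(2)) for i in range(2)]

def integrate(t_end, steps=2000):
    """Classical fourth-order Runge-Kutta integration starting from q = 0."""
    h = t_end / steps
    q = [0.0, 0.0]
    for _ in range(steps):
        k1 = qdot(q)
        k2 = qdot([q[i] + 0.5 * h * k1[i] for i in range(2)])
        k3 = qdot([q[i] + 0.5 * h * k2[i] for i in range(2)])
        k4 = qdot([q[i] + h * k3[i] for i in range(2)])
        q = [q[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
             for i in range(2)]
    return q

def closed_form(t):
    """Relaxation mode (root 2) in q1 - q2 plus stationary flow (root 0)
    in q1 + q2."""
    d = 0.5 * (Q[0] - Q[1]) * (1.0 - exp(-2.0 * t))
    s = (Q[0] + Q[1]) * t
    return [0.5 * (s + d), 0.5 * (s - d)]

if __name__ == "__main__":
    print(integrate(3.0), closed_form(3.0))
```

At large times the difference $q_1 - q_2$ settles to a constant while both coordinates grow linearly with the same velocity, which is the stationary flow of Eq. (3.9).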


4. A GENERAL PRINCIPLE OF MINIMUM RATE
OF ENTROPY PRODUCTION


In the previous section we have formulated the
general solutions of the system in its evolution toward
equilibrium. A somewhat related question is the
following. In the configuration space of the state variables
$q_i$, the thermodynamic state of a system is represented
by a point of coordinates $q_i$. When not in equilibrium
the system is subject to forces, both internal and
external, which are expressed by $Q_i - \partial V/\partial q_i$ and which
we shall call "dis-equilibrium forces." These forces may
be considered as proportional to the derivatives of the
generalized thermodynamic potential. The
instantaneous direction of evolution of the system in the
configuration space is represented by the velocity vector $\dot q_i$.
The velocity components $\dot q_i$ are also denoted by $J_i$ in
the literature and are called fluxes. The question arises
whether the direction of this vector can be determined
by a variational principle.

Let us first write the fundamental equation (2.14)
in a somewhat different form. We denote the
dis-equilibrium forces by

$$X_i = Q_i - \partial V/\partial q_i. \tag{4.1}$$


The Lagrangian equations (2.14) are then written

$$\partial D/\partial \dot q_i = X_i. \tag{4.2}$$

Consider now the quadratic form $D$ as a function of the
$n$ velocity components, and the condition that $D$ be an
extremum when we consider all possible values of the
vector $\dot q_i$ under the constraint that the vector $\dot q_i$ satisfy
the relation

$$\sum_i X_i \dot q_i = \text{const}, \tag{4.3}$$

with given values of the forces. This leads to the
absolute variational condition,

$$\sum_i (\partial D/\partial \dot q_i - kX_i)\,\delta \dot q_i = 0, \tag{4.4}$$

with an undetermined Lagrangian multiplier $k$. Except
for this factor, the variational condition (4.4) is
equivalent to the equation of motion (4.2). The variational
principle therefore determines the direction of the
velocity vector $\dot q_i$. The undetermined magnitude of the
vector may be fixed by the condition

$$2D = \sum_i X_i \dot q_i, \tag{4.5}$$

which expresses that the rate of energy dissipation is
equal to the power input.

Since $D$ is a positive-definite quadratic form, the
extremum corresponds to a minimum. Moreover, $D$ is
proportional to the rate of entropy production
associated with the velocities $\dot q_i$ of the system. Hence, we
state the following theorem:

Considering a system which is not in equilibrium, its 
instantaneous velocity direction is such that the rate of 
entropy production is a minimum for all possible velocity 
vectors satisfying the condition that the power input of the 
dis-equilibrium forces is constant. 
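The theorem can be illustrated for two variables. The sketch below is a numerical illustration under stated assumptions, not a proof: the matrix `B` and the forces `X` are arbitrary hypothetical values, the equations of motion (4.2) are taken as $\sum_j b_{ij}\dot q_j = X_i$, and trial velocities are generated by displacing the actual velocity along the direction orthogonal to $X$, which leaves the power input $\sum_i X_i \dot q_i$ unchanged.

```python
import random

B = [[2.0, 0.5], [0.5, 1.0]]   # b_ij: symmetric, positive-definite
X = [1.0, -0.4]                # dis-equilibrium forces

def dissipation(v):
    """D = (1/2) sum_ij b_ij v_i v_j."""
    return 0.5 * sum(B[i][j] * v[i] * v[j] for i in range(2) for j in range(2))

def actual_velocity():
    """Solve sum_j b_ij v_j = X_i by Cramer's rule."""
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    return [(X[0] * B[1][1] - B[0][1] * X[1]) / det,
            (B[0][0] * X[1] - B[1][0] * X[0]) / det]

def constrained_trials(n=1000, seed=0):
    """Trial velocities with the same power input as the actual velocity."""
    rng = random.Random(seed)
    v0 = actual_velocity()
    t = [-X[1], X[0]]          # orthogonal to X, so sum_i X_i t_i = 0
    return [[v0[0] + s * t[0], v0[1] + s * t[1]]
            for s in (rng.uniform(-2.0, 2.0) for _ in range(n))]

if __name__ == "__main__":
    d0 = dissipation(actual_velocity())
    print(d0, min(dissipation(v) for v in constrained_trials()))
```

Every trial velocity dissipates at least as much as the actual one, and the actual velocity also satisfies the magnitude condition $2D = \sum_i X_i \dot q_i$.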

A dual form of Eqs. (4.2) is obtained if we express
$D$ in terms of the forces $X_i$ instead of $\dot q_i$. From Euler's
theorem on homogeneous functions, we have

$$2D = \sum_i \frac{\partial D}{\partial \dot q_i}\, \dot q_i = \sum_i \dot q_i X_i. \tag{4.6}$$

Hence,

$$2\,dD = \sum_i \dot q_i\, dX_i + \sum_i X_i\, d\dot q_i. \tag{4.7}$$

Since

$$dD = \sum_i \frac{\partial D}{\partial \dot q_i}\, d\dot q_i = \sum_i X_i\, d\dot q_i, \tag{4.8}$$

we derive

$$dD = \sum_i \dot q_i\, dX_i. \tag{4.9}$$

We therefore have the dual form of Eqs. (4.2):

$$\partial D/\partial X_i = \dot q_i. \tag{4.10}$$

We may state a dual minimum entropy production
theorem identical with the above except that the
variables $X_i$ and $\dot q_i$ are interchanged. The minimizing vector
$X_i$ for $D$ is then in the configuration space of the forces.

It should be noted that the minimum theorems
expressed here may be formulated in other mathematically
equivalent forms. For instance, we may state that Eqs.
(4.2) are equivalent to the statement that the quantity

$$P = D - \sum_i X_i \dot q_i \tag{4.11}$$

is a minimum. Another equivalent statement is that
under the restraint that the energy dissipated is a
constant the power input is a maximum. Certain known
minimum theorems on energy dissipation in
electrodynamics and fluid mechanics are particular cases of
the above.⁴

5. MINIMUM PRINCIPLE FOR STATIONARY STATES 
AND RELAXATION MODES 


We have seen in Sec. 3 that if there are characteristic
roots $\lambda_s$ of the system which vanish, the system will
tend toward a stationary state which is defined by (3.9)
and for which all velocities are constant. This stationary
state is such that for all coordinates $q_i$ in the direction
of motion the "restoring force" vanishes, i.e.,

$$\partial V/\partial q_i = 0. \tag{5.1}$$

In that direction the system remains under constant
dis-equilibrium forces:

$$X_i = Q_i. \tag{5.2}$$

The minimum theorem of the previous section applies
to this case, but the condition of constant power input
is now

$$\sum_i Q_i \dot q_i = \text{const}. \tag{5.3}$$


A corresponding statement is of course valid for the 
dual form of the theorem. 

The minimum principle considered until now deter- 
mines the instantaneous velocity of the system. There 
are, however, as we shall now proceed to show, different 
variational properties which refer to the long-range 
time history of the system. 

Let us evaluate the rate of entropy production during 
the evolution of the system toward equilibrium or a 
stationary state. We have seen in reference 2 that the 
general equations (2.14) may be written by using 


4 See, e.g., J. H. Jeans, The Mathematical Theory of Electricity
and Magnetism (Cambridge University Press, London, 1933),
p. 321.







PRINCIPLES IN IRREVERSIBLE THERMODYNAMICS 


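normal coordinates $\xi_s$. The transformation is

$$q_i = \sum_s \phi_i^{(s)} \xi_s, \tag{5.4}$$

where $\phi_i^{(s)}$ is the modal column corresponding to the
relaxation mode $s$. The corresponding normal forces are

$$\Xi_s = \sum_j \phi_j^{(s)} Q_j. \tag{5.5}$$

The modal columns have the property of being
orthogonal, namely,

$$\sum_{ij} a_{ij} \phi_i^{(s)} \phi_j^{(r)} = \sum_{ij} b_{ij} \phi_i^{(s)} \phi_j^{(r)} = 0, \quad s \ne r. \tag{5.6}$$

Normalization is in such a way that if $\lambda_s \ne \infty$,

$$\sum_{ij} b_{ij} \phi_i^{(s)} \phi_j^{(s)} = 1; \tag{5.7}$$

and if $\lambda_s = \infty$,

$$\sum_{ij} a_{ij} \phi_i^{(s)} \phi_j^{(s)} = 1. \tag{5.8}$$

With these coordinates, the functions $V$ and $D$ become

$$V = \tfrac{1}{2} \sum_s \lambda_s \xi_s^2 + \tfrac{1}{2} \sum_k \xi_k^2, \qquad D = \tfrac{1}{2} \sum_s \dot\xi_s^2, \tag{5.9}$$

where the $\xi_k$ terms correspond to cases of infinite roots.
Equations (2.14) become ($p=d/dt$):

$$(p+\lambda_s)\,\xi_s = \Xi_s, \qquad \xi_k = \Xi_k. \tag{5.10}$$

Solutions of the first equations are

$$\xi_s = \frac{\Xi_s}{p+\lambda_s} = \frac{\Xi_s}{\lambda_s}\left(1 - e^{-\lambda_s t}\right). \tag{5.11}$$

If some of the roots $\lambda_s$ are zero, we denote them by $\lambda_m$
and write

$$\dot\xi_m = \Xi_m.$$

The rate of production of entropy is

$$\frac{dS}{dt} = \frac{2D}{T} = \frac{1}{T}\left[\sum_s \Xi_s^2\, e^{-2\lambda_s t} + \sum_m \Xi_m^2\right]. \tag{5.12}$$

We may state the following property: The rate of
production of entropy is a monotonically decreasing
function which tends toward a constant. All higher time
derivatives of the entropy also decrease monotonically and tend
to zero.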
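The monotonic decrease is easy to tabulate. The sketch below assumes the form of Eq. (5.12) that follows from $\dot\xi_s = \Xi_s e^{-\lambda_s t}$ for the relaxing modes and $\dot\xi_m = \Xi_m$ for the zero roots; the function name, the temperature, and the mode amplitudes used in the example are hypothetical.

```python
from math import exp

def entropy_rate(t, T, relaxing, flowing):
    """dS/dt = (1/T) [ sum_s Xi_s^2 exp(-2 lambda_s t) + sum_m Xi_m^2 ].

    relaxing: list of (Xi_s, lambda_s) pairs with lambda_s > 0
    flowing:  list of Xi_m for the zero roots (stationary flow)
    """
    decay = sum(xi * xi * exp(-2.0 * lam * t) for xi, lam in relaxing)
    steady = sum(xi * xi for xi in flowing)
    return (decay + steady) / T

if __name__ == "__main__":
    modes, flow = [(1.0, 0.5), (2.0, 3.0)], [0.7]
    for t in (0.0, 0.1, 1.0, 10.0, 100.0):
        print(t, entropy_rate(t, 300.0, modes, flow))
```

The tabulated rate falls steadily toward the constant contributed by the stationary flow alone.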

We note that $\dot\xi_s$, $\dot\xi_m$ are proportional to the
dis-equilibrium forces $X_s$, $X_m$ applied to each normal
coordinate:

$$\dot\xi_s = \Xi_s - \lambda_s \xi_s = X_s, \qquad \dot\xi_m = \Xi_m = X_m. \tag{5.13}$$

We may write the rate of entropy production as

$$dS/dt = \frac{1}{T} \sum_s X_s^2 + \frac{1}{T} \sum_m X_m^2. \tag{5.14}$$

The stationary state corresponds to $X_s=0$. Therefore,
in the stationary state the entropy production
considered as a function of the dis-equilibrium forces is a
minimum under the constraint that the forces $X_m$
corresponding to the stationary velocities are kept constant.
This latter property of the stationary state corresponds
to a theorem already formulated by Prigogine¹ but
derived in a different way.

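Another variational property refers to the modes of
relaxation themselves. The modal columns $\phi_i^{(s)}$ satisfy
the equations:

$$\sum_j a_{ij} \phi_j^{(s)} - \lambda_s \sum_j b_{ij} \phi_j^{(s)} = 0, \tag{5.15}$$

which result from the variational condition that

$$D = \tfrac{1}{2} \sum_{ij} b_{ij} \phi_i \phi_j \tag{5.16}$$

be an extremum under the constraint

$$V = \tfrac{1}{2} \sum_{ij} a_{ij} \phi_i \phi_j = \text{const}. \tag{5.17}$$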


As a familiar example of a system tending toward a 
steady state, we might visualize the one-dimensional 
flow of heat across a wall, one side of which is suddenly 
brought to a constant higher temperature. The system 
tends to a steady state when the distribution of tem- 
perature is linear and the rate of entropy production 
is constant. The only remaining time varying coordinate 
is the total entropy input which is proportional to the 
time. The unsteady part of the temperature distribution 
is a superposition of sinusoidal modes, each with its 
own exponential decay. 


6. VARIATIONAL PRINCIPLE FOR THE CASE OF 
HIDDEN COORDINATES 


Up to now we have involved all the degrees of freedom
in the variational equations. However, it is possible to
introduce a variational principle which involves only a
partial number of the total degrees of freedom. We
have shown² that for a system with $n$ degrees of
freedom, if $k$ variables are observed, the forces applied to
these degrees of freedom are expressed in terms of the
coordinates as

$$Q_i = \sum_{j=1}^{k} T_{ij} q_j, \tag{6.1}$$

with

$$T_{ij} = \sum_s \frac{p}{p+r_s}\, D_{ij}^{(s)} + D_{ij} + D_{ij}'\, p, \tag{6.2}$$

where the $D$'s are constants, the $r_s$ are decay constants,
and $p$ is the time operator. The symmetry of the
coefficients

$$T_{ij} = T_{ji}$$

leads to a quadratic invariant:

$$I = \tfrac{1}{2} \sum_{ij} T_{ij} q_i q_j, \tag{6.3}$$

and Eq. (6.1) may be expressed by the relation

$$Q_i\, \delta q_i = \delta I = \Big(\sum_j T_{ij} q_j\Big)\, \delta q_i, \tag{6.4}$$

to be satisfied identically for all virtual displacements
$\delta q_i$.

This variational principle, as formulated here, applies
to all phenomena expressible by the basic
thermodynamic equations on which the present paper is based
and is therefore quite general. As an example of the
fecundity of this principle, it is of interest to formulate
it more specifically for the case of a viscoelastic
continuum. In terms of the stress tensor $\sigma_{\mu\nu}$ and the strain
tensor $e_{\mu\nu}$, it was shown² that for an anisotropic material
the relations are

ow=d P ww 03, 


ij 


(6.5) 
with 


P,,4= 2 Dyy*?®+-Dyy*+ pD' wy. (6.6) 


8 ptr; 


The corresponding operational invariant is, without the 


summation signs, 7 
[= 3 Pus *te sje", (6.7) 


and the variational principle may be expressed as 


ode" = I. (6.8) 


The usefulness of this formulation lies in the fact that since the internal stress field is in equilibrium, the total virtual work is equal to that of the forces applied to the boundary of the continuum. Denoting by $F_\nu$ this boundary force and by $x_\nu$ the boundary coordinates, we have

$$\iiint_V \sigma_{\mu\nu}\,\delta e_{\mu\nu}\,dV = \iint_S F_\nu\,\delta x_\nu\,dS, \qquad (6.9)$$

where the volume integral is taken in the volume V bounded by S. Hence the variational principle in the form

$$\iiint_V \delta I\,dV = \iint_S F_\nu\,\delta x_\nu\,dS. \qquad (6.10)$$


The procedure exemplified here for a viscoelastic con- 
tinuum is not restricted to the case of a stress field and 
may be used to analyze the time history of complex 
physical chemical systems, by means of a suitable choice 
of generalized coordinates in a way quite analogous to 
the example treated hereafter. The disappearance of the 
virtual work of the internal forces is then replaced by 
the more general condition of conservation of mass and 
energy fluxes between the interacting cells. 

In the above derivation dynamic effects have been 
neglected. It can be easily verified that the acceleration 
of the observed coordinates may be included by in- 
troducing the virtual work of the inertia forces as done 
in the expression of d’Alembert’s principle. 


7. APPLICATION TO THE BENDING OF A 
VISCOELASTIC PLATE 


As an example of the variational method, we shall 
treat the problem of two-dimensional bending of a 
viscoelastic plate of isotropic homogeneous material. 




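The stress-strain law of such a material is expressed operationally as²

$$\sigma_{\mu\nu} = 2Q\,e_{\mu\nu} + \delta_{\mu\nu} R\,e, \qquad (7.1)$$

$$\delta_{\mu\nu} = \begin{cases} 1, & \mu=\nu \\ 0, & \mu\neq\nu, \end{cases} \qquad e = \sum_{\mu\nu} \delta_{\mu\nu}\, e_{\mu\nu},$$

with the operators Q and R given by

$$Q = \sum_s \frac{p}{p+r_s} Q^{(s)} + Q + Q'p, \qquad R = \sum_s \frac{p}{p+r_s} R^{(s)} + R + R'p. \qquad (7.2)$$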


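With Cartesian components of displacement u, v, w, the strain tensor is defined as

$$e_{xx} = \frac{\partial u}{\partial x}, \text{ etc.}; \qquad e_{xy} = \frac{1}{2}\left(\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\right), \text{ etc.} \qquad (7.3)$$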


We consider a plate of thickness h. The xy-plane is parallel with the faces located at $z=\pm h/2$. We choose as a representative deformation:

$$u = z\,q_2(x), \qquad v = 0, \qquad w = q_1(x). \qquad (7.4)$$

This constitutes a two-dimensional bending and shearing deformation parallel with the xz-plane. The functions $q_1$, $q_2$ of x are to be determined. Components of the strain tensor are

$$e_{xx} = \partial u/\partial x = z\,dq_2/dx,$$
$$e_{yy} = e_{zz} = e_{xy} = e_{yz} = 0, \qquad (7.5)$$
$$e_{xz} = \tfrac{1}{2} q_2 + \tfrac{1}{2}\,dq_1/dx.$$

The invariant I is

$$I = \tfrac{1}{2}\sigma_{\mu\nu} e_{\mu\nu} = \tfrac{1}{2}(2Q+R)\,e_{xx}^2 + 2Q\,e_{xz}^2. \qquad (7.6)$$

In order to apply the variational principle (6.10) we must integrate I over the volume. We first integrate along the thickness of the plate and obtain

$$\int_{-h/2}^{h/2} I\,dz = \frac{B}{2}\left(\frac{dq_2}{dx}\right)^2 + \frac{h}{2}\,Q\left(\frac{dq_1}{dx} + q_2\right)^2, \qquad (7.7)$$

with $B=(h^3/12)(2Q+R)$. We then integrate with respect to x

$$J = \int_0^l dx \int_{-h/2}^{h/2} I\,dz = \frac{B}{2}\int_0^l \left(\frac{dq_2}{dx}\right)^2 dx + \frac{Qh}{2}\int_0^l \left(\frac{dq_1}{dx} + q_2\right)^2 dx. \qquad (7.8)$$

If we assume that $\delta q_1$ and $\delta q_2$ are zero at the end points $x=0$ and $x=l$, the variation of J is

$$\delta J = -B\int_0^l \frac{d^2 q_2}{dx^2}\,\delta q_2\,dx + Qh\int_0^l \left(\frac{dq_1}{dx} + q_2\right)\delta q_2\,dx - Qh\int_0^l \left(\frac{d^2 q_1}{dx^2} + \frac{dq_2}{dx}\right)\delta q_1\,dx. \qquad (7.9)$$

If a force f is applied to the surface of the plate per unit area in the z-direction, the virtual work of this force is

$$\int_0^l f\,\delta w\,dx = \int_0^l f\,\delta q_1\,dx. \qquad (7.10)$$

Applying Eq. (6.10) of the previous section, the variations (7.9) and (7.10) must be equal. The expressions multiplying $\delta q_1$ in (7.9) and (7.10) must be equal and that multiplying $\delta q_2$ in (7.9) must vanish. We derive the differential equations:

$$B\frac{d^2 q_2}{dx^2} = Qh\left(\frac{dq_1}{dx} + q_2\right), \qquad Qh\left(\frac{d^2 q_1}{dx^2} + \frac{dq_2}{dx}\right) = -f. \qquad (7.11)$$

Eliminating $q_2$, we find

$$\frac{d^4 q_1}{dx^4} = \frac{f}{B} - \frac{1}{Qh}\frac{d^2 f}{dx^2}. \qquad (7.12)$$


The first term on the right-hand side corresponds to a 
bending deflection while the second term corresponds 
to a shearing deformation. We must remember that 
the differential equation (7.12) is also an operational 
equation in the time variable since B and Q are time 
operators. It is therefore also an integro-differential 
equation.


Thermal Ionization and Capture of Electrons Trapped in Semiconductors*


HERMANN GUMMEL AND MELVIN LAX
Physics Department, Syracuse University, Syracuse, New York 


(Received October 27, 1954) 


The use of Coulombic wave functions rather than plane waves for the free electron states is found to 
increase the calculated rate of capture of electrons by a factor of about 200 at liquid helium temperatures. 
Results calculated for shallow traps in Ge and Si are now found to be consistent with the upper limit set on 
the photoconductive lifetime by the experiment of Burstein, Oberly, and Davisson. The Born-Oppenheimer
and Hartree approximations used in our calculations were found to yield identical results at low tempera- 


tures in these materials. 


THE thermal ionization of an electron from a trapped state into the conduction band has previously been calculated using plane (or Bloch) waves for the final state.¹⁻⁴ Such a procedure neglects the effect of Coulomb attraction on the final state which should be represented by a Coulomb wave function. It is well known that the correct wave function has a density at the origin higher than the plane wave by the Sommerfeld factor $y/[1-\exp(-y)]$ with $y=2\pi/(ka)$, where k is the propagation constant of the final state and a is the effective Bohr radius of the electron in the crystal. It is important to note that during ionization the major contribution to the total transition probability comes from final states with ka<1, so that

* Supported in part by the U. S. Air Force, through the Office 
of Scientific Research of the Air Research and Development Com- 
mand and in part by the Office of Naval Research. A preliminary 
report of these results was presented on March 19, 1954 to the 
Detroit meeting of the American Physical Society, Phys. Rev. 
94, 1419(A) (1954). 

¹ Goodman, Lawson, and Schiff, Phys. Rev. 71, 191 (1947).

² Paul J. Leurgans, thesis, University of Illinois, 1952 (unpublished).

³ R. Kubo, Phys. Rev. 86, 929 (1952).

⁴ L. Tewordt, Z. Physik 137, 604 (1954).


the correction factor is always large. At room tempera- 
ture the factor is about 50, at He temperature it is 
about 200. On capture, only electrons with ka<1 have 
large capture cross sections. And in any case, only elec- 
trons of thermal velocities are present, which at most 
temperatures of interest have ka<1. 
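
The size of the Coulomb correction can be checked directly from the Sommerfeld factor quoted above. The following is a small sketch; the function name `sommerfeld_factor` is introduced here for illustration, and the sample values of ka are assumptions chosen only to reproduce the orders of magnitude mentioned in the text.

```python
import math


def sommerfeld_factor(ka):
    """Enhancement y / [1 - exp(-y)] with y = 2*pi / (ka).

    ka is the product of the propagation constant of the free state
    and the effective Bohr radius (dimensionless, must be positive).
    """
    if ka <= 0.0:
        raise ValueError("ka must be positive")
    y = 2.0 * math.pi / ka
    # -expm1(-y) equals 1 - exp(-y) and stays accurate for small y.
    return y / (-math.expm1(-y))


# Illustrative values only: ka chosen so that the factor is near 50 and 200.
for ka in (2.0 * math.pi / 50.0, 2.0 * math.pi / 200.0):
    print(f"ka = {ka:.4f}  factor = {sommerfeld_factor(ka):.1f}")
```

For ka well below unity the factor is essentially y itself, while for ka much larger than unity it tends to 1 and the plane-wave result is recovered.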

We have made a calculation of the ionization and 
capture probabilities in Si and Ge using Coulombic 
wave functions. The calculation was done both in the 
Hartree and the Born-Oppenheimer approximation. 
Results were applied to the case where the highest 
phonon energy is larger than the ionization energy, i.e., 
one-phonon processes are possible. Then both approxi- 
mations yield the same result if multiphonon processes 
are unimportant, i.e., at low temperatures. The Hartree 
approximation in first order gives one-phonon processes 
only at any temperature, while the Born-Oppenheimer 
approximation gives many-phonon processes at higher 
temperatures. This result is independent of the par- 
ticular mechanism considered for the interaction as 
long as one restricts oneself to an interaction potential 
that is linear in the displacements of the atoms from 
their equilibrium positions. We considered two types of interaction of the electron with the lattice, the Bardeen-Shockley⁵ deformation potential and the interaction used by Goodman, Lawson, and Schiff¹ which depends exclusively on the motion of the impurity atom. Without committing ourselves as yet to a particular mechanism, we consider an interaction potential of the form

$$V = N^{-1/2}\sum_{t,\tau} B_t(\mathbf{r},\tau)\, q_t(\tau), \qquad (1)$$

where $q_t(\tau)$ is the amplitude of the mode of propagation vector $\tau$ and of type t. Denoting the matrix element of $B_t(\mathbf{r},\tau)$ between the final and initial unperturbed electronic states (a for initial state, k for free state with propagation vector k) by $B_t(\tau)_{ka}$ and with a corresponding notation for the matrix elements between like states, we obtain⁶ for the transition probabilities to state k and from state k:


Weo= hr f exp[iExat/h+/() HO +Le) Phat, 


Wor=Wr* f expl—iBsat/h+ fOMM+LRO PA 


f= @(6) (Bux—Baa)?/(2hMa*)), (2) 
h(t)= ((®(t,w) + (20+ 1) ](hw/2M) 
X[hBxa/(Ex— Ez) P), 
g(t) = (®(t,w)ABra(Bux— Baa) 2Mw(Ex— Ea) }°), 
(tw) = (+1) exp(twt) +7 exp(— wt) — (2%+1). 


Here the angular braces $\langle\ \rangle$ denote an average over the modes of one type and a summation over all types. The indices t and $\tau$ have been suppressed in B; $\bar{n}$ is the mean quantum number and $\omega$ the frequency of the mode in question. M is the mass of one Ge atom. $E_k$ and $E_a$ are the unperturbed final and initial electronic energies and $E_{ka}$ is their difference plus the energy change due to the shift of the equilibrium positions of the atoms. To obtain the ionization probability $W_{\rm ion}$ we have to integrate $W_{ka}$ over all final free electron states, while for the capture probability $W_{\rm cap}$ we take the average of $W_{ak}$ over the thermally distributed k-states of the electron. If the exponential containing f(t) is expanded the integration over t can be performed immediately, giving rise to delta functions. At low temperatures in Ge and Si, only the first term in this expansion, representing one-phonon processes, is important.⁷ If, in the neighborhood
of k=0, the matrix element | B;(*)xa|? is proportional 
to k*, we obtain at low temperatures: 


W ion= FN GPA (T) exp(—I/koT)/ (645M 1), 


W cap = FQ/(8x2M1), 


⁵ J. Bardeen and W. Shockley, Phys. Rev. 80, 72 (1950).

⁶ For the methods used consult M. Lax, J. Chem. Phys. 20, 1752 (1952). For a review of the literature see M. Lax, The Influence of Lattice Vibrations on Electronic Transitions in Solids, Proceedings of the Nov. 1954 Atlantic City Conference on Photoconductivity (John Wiley and Sons, Inc., New York, 1955).

⁷ The same conclusion was obtained by Y. Yafet for transitions between bound states (private communication).

where


F=>, f nrnt| Bi() val 8LZ/h—cor(2) lee, 


A(T)= f expl — E(k)/koT ]dk, 


P=A(T)" f } exp[—E(k)/koT Uk, 


a= (h)>. 


Here $\Omega$ is the volume of the unit cell and I the ionization energy. If the mechanism causing the transition is the deformation potential, the B are given by

$$B_t(\mathbf{r},\tau) = (i/\sqrt{2})\,E_1\,(\mathbf{e}_t(\tau)\cdot\tau)\exp(i\tau\cdot\mathbf{r}), \qquad (4)$$

where the $\mathbf{e}_t(\tau)$ are unit polarization vectors. It is adequate to use a Debye spectrum (longitudinal speed of sound = v) in the present calculation. To get an order of magnitude result it is probably adequate to use a scalar effective mass m*. For the deformation potential interaction, we thus obtain⁸


W ion= 32QE 2 (watM Iv)“ (r9a)-42m*a*h-koT 
Xexp(—J/koT), (5) 


Weap= 256r'NE?(VaMIv)— (79a) 
X (2m*a*h-koT)-*, (6) 


where a is the effective Bohr radius = Bohr radius times dielectric constant times ratio of electronic to effective mass; $\tau_0$ is the propagation constant of those modes for which $\hbar\omega(\tau_0)=I$, and V is the volume of the crystal.
These equations apply if koT<«I and koT<K(kA—l), 
where ko is the Debye energy. 

For n-type silicon, using as ionization energy J~0.04 
ev and an interaction constant E,==15 ev,’ we obtain 


VWeap™(40/T)!X 10-8 cm*/sec. 


If we assume 10° minority carriers (acceptors) per cm’ 
to be present and have one donor and one acceptor 
level only, the lifetime of free electrons at 4°K is of 
order 3X 10-8 sec. This is consistent with experiments 
on the photoconductive lifetime by Burstein, Oberly, 
and Davisson™ who find an upper limit to the lifetime 
of 10- sec. If one uses the interaction of Goodman, 
Lawson, and Schiff,’ the value for the capture proba- 
bility at 4°K is smaller by a factor of about 17. 

8 Details of the above calculations will be submitted to this 
journal shortly. 


® This value for the interaction constant is obtained by using 

the formula (see reference 5): 
_ 4(a heh’ Me? 
B*3Q(m*)2 (oT PE” 

taking «= 1200 cm*/volt-sec [M. B. Prince, Phys. Rev. 93, 1204 
(1954)], and letting (m*)-5!2= (mymoms) 4X4 (1/mi)+(1/m) 
+(1/ms)] with m= m2=0.19m; m3=0.98m. 

¹⁰ Burstein, Oberly, and Davisson, Phys. Rev. 89, 331 (1953).


Photodiffusion Current Hall Effect: Transient Behavior


L. H. HALL*
Physics Department, University of Illinois, Urbana, Illinois
(Received November 15, 1954)


Illumination at the surface of a semiconductor sample generates a photodiffusion current, and a magnetic 
field applied transversely to the flow develops a Hall potential. Observation of the transient behavior 
shows a voltage peak preceding the steady state value. A theoretical account of this behavior is given 


under simplifying assumptions. 





INTRODUCTION 


AT the illuminated surface of a semiconductor photogenerated electron-hole pairs form and diffuse away. A magnetic field applied transversely to the flow produces a Hall effect. Experimental observation and mathematical analysis of the steady state Hall potential under these conditions have been reported by several people.¹ Recently Bulliard² has observed the transient behavior which occurs; he finds a quick initial rise in the Hall potential and a subsequent fall-away to the steady state level. A time-dependent solution is here developed, which exhibits this behavior.


MATHEMATICAL FORMULATION 


We treat the case of a semiconducting plate of infinite area and finite thickness, uniformly illuminated at one face by a beam normal to and absorbed at the surface. In the event of unequal hole and electron mobilities, inequality in the p and n distributions results and a compensating electric field $E_y$ develops. We assume that the recombination rate for pairs is linear in the minority concentration, an approximation generally considered adequate for small or large, but not intermediate, signals. To be specific, let the semiconductor be n-type and the direction of the beam be labeled y.

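Under these provisions the equations of flow, continuity, and Poisson within the interior are

$$\mathbf{J}_p = -eD_p\nabla p + e\,p\,\mu_p E_y, \qquad (1)$$
$$\mathbf{J}_n = eD_n\nabla n + e(n_0+n)\mu_n E_y, \qquad (2)$$
$$\partial p/\partial t = -(1/e)\nabla\cdot\mathbf{J}_p - fp, \qquad (3)$$
$$\partial n/\partial t = (1/e)\nabla\cdot\mathbf{J}_n - fp, \qquad (4)$$
$$\nabla\cdot\mathbf{E} = (4\pi e/\kappa)(p-n). \qquad (5)$$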


The subscripts p and n refer to holes and electrons respectively; p and n are the photoproduced excess concentrations of carriers. If the initial concentrations


* John Simon Guggenheim Memorial Foundation Fellow. On leave from Santa Barbara College of the University of California, 1953-1954.

¹ P. Aigrain and H. Bulliard, Compt. rend. 236, 595 (1953); Moss, Pincherle, and Woodward, Proc. Phys. Soc. (London) B66, 743 (1953); J. Frenkel, Physik. Z. Sowjetunion 8, 185 (1935); I. K. Kikoin and M. M. Noskov, Physik. Z. Sowjetunion 5, 586 (1934); H. Dember, Physik. Z. 32, 856 (1931); 33, 207 (1932).

² H. Bulliard, Phys. Rev. 94, 1564 (1954).


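are $p_0$ and $n_0$, the total concentrations are $p_0+p$ and $n_0+n$, with $p_0=0$ for an n-type semiconductor; $\mu$ is the mobility, J the electric current density, f the specific bulk recombination rate ($1/f=\tau$, the bulk lifetime). The equations of continuity and flow may be combined to

$$\partial p/\partial t = D_p\,\partial^2 p/\partial y^2 - \mu_p\,\partial(pE_y)/\partial y - fp, \qquad (6)$$
$$\partial n/\partial t = D_n\,\partial^2 n/\partial y^2 + \mu_n\,\partial(nE_y)/\partial y - fp, \qquad (7)$$

where we have availed ourselves of the fact that the problem is one-dimensional. We make the assumption that the difference (p−n) is small and negligible compared with p or n. Then, upon addition of (6) and (7), weighted by $\mu_n$ and $\mu_p$ respectively, we can neglect the electric field term, getting

$$(\mu_p+\mu_n)\,\partial p/\partial t = (\mu_n D_p + \mu_p D_n)\,\partial^2 p/\partial y^2 - (\mu_p+\mu_n)fp,$$

or

$$\partial p/\partial t = D\,\partial^2 p/\partial y^2 - fp, \qquad (8)$$

where

$$D = (\mu_n D_p + \mu_p D_n)/(\mu_p+\mu_n). \qquad (9)$$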


With the origin placed at the bright face, the boundary conditions are:
For y=0, t>0,

$$(1/e)J_{py} = I - \sigma p,$$

where I = the effective photon intensity, $\sigma$ = the surface recombination velocity. At the dark face, y=a, t>0,

$$(1/e)J_{py} = \sigma p.$$
Transient phase: 

We now assume further that the conduction term in 
(1) is small compared with the diffusion term. This 
condition should obtain at least during the early period 
of the transient, particularly if the illumination is not 
too strong. In the numerical example evaluated later 
the voltage peak occurs at time $t/\tau\approx 0.1$, and for
illumination such that the steady state ratio of induced 
to native conductance is as high as 3.6, the ratio of 
conductance to diffusion term, averaged over the thick- 
ness, is at this time ~20 percent. For weaker illumi- 
nation such that induced and native conductances are 
equal, the ratio is ~6 percent. Under the above 


assumption

$$J_{py} \approx -eD_p\,\partial p/\partial y, \qquad (10)$$

and the boundary conditions become at y=0 and y=a respectively,

$$\partial p/\partial y - (\sigma/D_p)p + I/D_p = 0; \qquad \partial p/\partial y + (\sigma/D_p)p = 0. \qquad (11)$$


For simplicity we have taken the case where the surface 
recombination velocities are the same at the two faces. 
Initially, 


p=n=0, I=0. (12) 


Let a uniform magnetic field be applied perpendicularly to the light beam. We now seek the Hall field $E_x$ required to maintain the carrier flow in the original y-direction. To first order the current density components in the x-direction are (with $b=\mu_n/\mu_p$)

$$J_{px} = e\,p\,\mu_p E_x + \mu_p J_{py} B, \qquad (13)$$
$$J_{nx} = e(n_0+p)\,b\mu_p E_x - b\mu_p J_{ny} B. \qquad (14)$$

Eliminating $E_y$ between (1) and (2) and utilizing the approximation (10), we arrive at $J_{ny}=-bJ_{py}$, applying to the early transient period. The last term in (14) may therefore be written $+b^2\mu_p J_{py}B$. Since p, n and $-J_{py}$ are functions of y (decreasing) so also are the Hall field $E_x$ and $J_{px}$ and $J_{nx}$. Under the conditions of the Bulliard² experiment there is an ohmic contact across each end of the plate; thus the Hall field appropriate to the experiment is one which maintains the total current in the x direction (rather than the current density) equal to zero:

$$\int_0^a E_x\,e\mu_p\{p(1+b) + n_0 b\}\,dy + \int_0^a J_{py}\,\mu_p B(1+b^2)\,dy = 0. \qquad (15)$$


Further, the potential measured is in the nature of an average along the direction y. We take as the average value of $E_x$ that constant value $\bar{E}_x$ which satisfies Eq. (15). $\bar{E}_x$ can then be taken outside of the integral sign and an expression for it readily obtained. While the symmetry assumed has been that of a plate of infinite area, we suppose the end effects small in a finite plate of sufficient area. The average potential difference between the two ends, of separation l, is then $V=l\bar{E}_x$, and its magnitude is

$$V = \frac{Bl(1+b^2)\mu_p e D_p\,[p(0)-p(a)]}{n_0 e b\mu_p a + e(1+b)\mu_p\int_0^a p\,dy}. \qquad (16)$$

In (16) account has been taken of the relation

$$\int_0^a J_{py}\,dy \approx -eD_p\int_0^a \frac{\partial p}{\partial y}\,dy.$$

It is seen that V is of the form

$$V \propto [p(0)-p(a)]/(G_0+G_i); \qquad (17)$$

thus V is directly proportional to the average carrier concentration gradient and inversely proportional to the total conductance (native conductance plus induced photoconductance).


Steady state: 

In the interests of simplicity we limit discussion to two cases: (a) weak illumination, $p\ll n_0$; (b) strong illumination, $p\gg n_0$. For the steady state of course $J_{ny}=-J_{py}$, which with (1) and (2) leads in case (a) to $J_{py}=-eD_p\,\partial p/\partial y$ just as in the transient state and requires merely that in Eq. (16) for V we replace the quantity $(1+b^2)$ by $(1+b)$. In (b) however we are led to $J_{py}=-[2b/(1+b)]\,eD_p\,\partial p/\partial y$, and for the corresponding value of V we have in (25), (26) to replace the parameters $\beta$ by $\beta'=2b\beta/(1+b)$, $\gamma$ by $\gamma'=(b+1)\gamma/2b$ and in (16), $(1+b^2)$ by $2b$.


SOLUTION BY LAPLACE TRANSFORMATION 


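We consider now the solution of the equations for carrier concentration p. The system we have to solve is Eq. (8) with boundary conditions (11) and initial condition (12). We make a Laplace transformation on t, multiplying Eq. (8) by $e^{-st}$ and integrating from 0 to $\infty$, to get

$$d^2\bar{p}/dy^2 - \lambda^2\bar{p} = 0, \qquad (18)$$

where

$$\bar{p} = \int_0^\infty p\,e^{-st}\,dt, \qquad \lambda^2 = (s+f)/D. \qquad (19)$$

The transformed boundary conditions are:
For y=0, t>0:

$$d\bar{p}/dy - k\bar{p} + I/D_p s = 0, \qquad (20)$$

with $k=\sigma/D_p$.
For y=a, t>0:

$$d\bar{p}/dy + k\bar{p} = 0. \qquad (21)$$

The solution of the transformed system is

$$\bar{p} = C_1\sinh\lambda y + C_2\cosh\lambda y, \qquad (22)$$

with

$$C_1 = -\frac{I(\lambda\sinh\lambda a + k\cosh\lambda a)}{D_p s[(\lambda^2+k^2)\sinh\lambda a + 2k\lambda\cosh\lambda a]}, \qquad C_2 = \frac{I(\lambda\cosh\lambda a + k\sinh\lambda a)}{D_p s[(\lambda^2+k^2)\sinh\lambda a + 2k\lambda\cosh\lambda a]},$$

where we choose the positive root of (19). With the help of hyperbolic identities, (22) then reduces to

$$\bar{p} = \frac{I[\lambda\cosh\lambda(a-y) + k\sinh\lambda(a-y)]}{D_p s[(\lambda^2+k^2)\sinh\lambda a + 2k\lambda\cosh\lambda a]}. \qquad (23)$$

By the Laplace-Mellin inversion theorem,

$$p = \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} \bar{p}\,e^{st}\,ds. \qquad (24)$$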


The integrand is actually a single-valued function of s and the integral can therefore be evaluated simply by residue integration. The integrand has poles at s=0 and at the zeros of the bracketed factor in the denominator of $\bar{p}$. When s=0, the residue multiplied by $2\pi i$ is

$$\frac{I\{\sinh[(a-y)/L] + \beta\cosh[(a-y)/L]\}}{\sigma\{(1+\beta^2)\sinh(a/L) + 2\beta\cosh(a/L)\}}, \qquad (25)$$

where $L=(D/f)^{1/2}$ and $\beta=D_p/L\sigma$.
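
Expression (25) can be checked against the boundary conditions (11) numerically. The following is a minimal sketch; the helper names `steady_state_p` and `boundary_residuals`, the unit intensity, and the sample parameter values are assumptions made for illustration, and the derivative is taken by a central finite difference rather than analytically.

```python
import math


def steady_state_p(y, I, sigma, a, L, beta):
    """Steady-state excess concentration, expression (25)."""
    numerator = math.sinh((a - y) / L) + beta * math.cosh((a - y) / L)
    bracket = (1.0 + beta ** 2) * math.sinh(a / L) + 2.0 * beta * math.cosh(a / L)
    return I * numerator / (sigma * bracket)


def boundary_residuals(I, sigma, a, L, D_p, h=1e-6):
    """Left-hand sides of the two conditions (11); both should be ~0."""
    beta = D_p / (L * sigma)

    def p(y):
        return steady_state_p(y, I, sigma, a, L, beta)

    def dp(y):
        # Central difference approximation to dp/dy.
        return (p(y + h) - p(y - h)) / (2.0 * h)

    bright = dp(0.0) - (sigma / D_p) * p(0.0) + I / D_p  # face y = 0
    dark = dp(a) + (sigma / D_p) * p(a)                  # face y = a
    return bright, dark


# Illustrative values in cm, cm/sec and cm^2/sec units (assumed).
bright, dark = boundary_residuals(I=1.0, sigma=200.0, a=0.3, L=0.178, D_p=40.0)
print(bright, dark)
```

Both residuals vanish to within the finite-difference error, which confirms that the s=0 pole alone reproduces the time-independent solution.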

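The zeros of the bracket are pure imaginary, $\lambda=\pm i\alpha_n/a$, n=1, 2, ···, where the $\alpha_n$ are roots of the equation

$$\alpha/\gamma - \gamma/\alpha - 2\cot\alpha = 0,$$

and $\gamma=ak=a\sigma/D_p$. We need only the positive roots for $\lambda$. The residues associated with these roots multiplied by $2\pi i$ are

$$-\frac{2I\tau D}{aD_p}\sum_{n=1}^{\infty} e^{-\nu_n t/\tau}\,\frac{\alpha_n\{\alpha_n\cos[\alpha_n(a-y)/a] + \gamma\sin[\alpha_n(a-y)/a]\}}{\nu_n\{(\alpha_n^2-\gamma^2-2\gamma)\cos\alpha_n + 2(\gamma+1)\alpha_n\sin\alpha_n\}}, \qquad (26)$$

where $\nu_n=\{(L\alpha_n/a)^2+1\}$.
Then, in abbreviated form,

$$p = (25) + (26).$$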


Evidently expression (25) represents the steady state value of p, in agreement with the expression derived by Moss, Pincherle, and Woodward³ except for their term taking account of penetration of the illumination.
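
The roots $\alpha_n$ of the transcendental equation above have no closed form but are easily located numerically. The sketch below uses plain bisection and is not part of the original treatment; the function names are hypothetical. It relies on the observation that the left-hand side increases monotonically from $-\infty$ to $+\infty$ on each interval $((n-1)\pi, n\pi)$, so that each interval holds exactly one root.

```python
import math


def characteristic(alpha, gamma):
    """Left-hand side alpha/gamma - gamma/alpha - 2*cot(alpha)."""
    return alpha / gamma - gamma / alpha - 2.0 / math.tan(alpha)


def alpha_roots(gamma, count, tol=1e-12):
    """First `count` positive roots, one per interval ((n-1)*pi, n*pi)."""
    roots = []
    for n in range(1, count + 1):
        lo = (n - 1) * math.pi + 1e-9   # function is large and negative here
        hi = n * math.pi - 1e-9         # function is large and positive here
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if characteristic(mid, gamma) < 0.0:
                lo = mid
            else:
                hi = mid
            if hi - lo < tol:
                break
        roots.append(0.5 * (lo + hi))
    return roots


def decay_factors(roots, L_over_a):
    """nu_n = (L * alpha_n / a)**2 + 1 for each root."""
    return [(L_over_a * r) ** 2 + 1.0 for r in roots]


roots = alpha_roots(gamma=1.5, count=4)
print(roots, decay_factors(roots, L_over_a=0.178 / 0.3))
```

Because the $\nu_n$ grow roughly as $n^2$, the exponentials in (26) suppress all but the first few terms once t is a modest fraction of $\tau$.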

Upon evaluation of $p(0)-p(a)$ and $\int_0^a p\,dy$, the solution for the average Hall potential (16) for small t may be written

$$V = \frac{IBlD_p(1+b^2)\left[S - \dfrac{2\tau D}{aD_p}\displaystyle\sum_{n=1}^{\infty} F_n e^{-\nu_n t/\tau}\right]}{n_0 ba + I(1+b)\left\{S' - \dfrac{2\tau D}{D_p}\displaystyle\sum_{n=1}^{\infty} F_n' e^{-\nu_n t/\tau}\right\}}, \qquad (27)$$

where

$$S = \frac{1}{\sigma}\,\frac{\beta\cosh(a/L) + \sinh(a/L) - \beta}{(1+\beta^2)\sinh(a/L) + 2\beta\cosh(a/L)}, \qquad (28)$$

$$S' = \frac{L}{\sigma}\,\frac{\beta\sinh(a/L) + \cosh(a/L) - 1}{(1+\beta^2)\sinh(a/L) + 2\beta\cosh(a/L)}, \qquad (29)$$

$$F_n = \frac{\alpha_n[\gamma\sin\alpha_n - \alpha_n(1-\cos\alpha_n)]}{\nu_n[(\alpha_n^2-\gamma^2-2\gamma)\cos\alpha_n + 2(\gamma+1)\alpha_n\sin\alpha_n]}, \qquad (30)$$

$$F_n' = \frac{\alpha_n[\sin\alpha_n + \gamma(1-\cos\alpha_n)/\alpha_n]}{\nu_n[(\alpha_n^2-\gamma^2-2\gamma)\cos\alpha_n + 2(\gamma+1)\alpha_n\sin\alpha_n]}. \qquad (31)$$

Fig. 1. Hall potential vs time: theoretical curve.

Fig. 2. Hall potential vs time: experimental curve (Bulliard). The curves in Figs. 1 and 2 are not directly comparable as the parameters are not identical.

³ Moss, Pincherle, and Woodward, Proc. Phys. Soc. (London) B66, 743 (1953).

NUMERICAL APPLICATION 














Figure 1 displays the Hall potential curve computed from (27) for a germanium specimen somewhat comparable to that used by Bulliard.² He estimates his parameters as $a\approx 0.3$ mm, $\tau\approx 600$ μsec, $20<\sigma<309$ cm sec⁻¹. For comparison there is sketched in Fig. 2 Bulliard's oscilloscope curve for an illumination with ratio of induced photoconductance to native conductance $G_i/G_0=2.6$. Because of incomplete information on his parameters a direct check, however, is not possible. We use $\tau$ and a values above, $\sigma=200$ cm sec⁻¹, $G_i/G_0=3.6$, $D_p=40$ cm² sec⁻¹, $b=\mu_n/\mu_p\approx 2$ giving $D=2bD_p/(b+1)=53$ cm² sec⁻¹, $L=(D\tau)^{1/2}=0.178$ cm, $\beta=D_p/L\sigma=1.12$, $\gamma=\sigma a/D_p=1.5$. Only a few of the early terms in the two infinite series are appreciable, and when $t/\tau=0.2$ only two terms in each need be kept, thanks to the exponential damping factor.
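
The derived quantities quoted above follow from the stated inputs by simple arithmetic, which can be sketched as below. The function names are hypothetical, and the reduction of Eq. (9) to $D=2bD_p/(b+1)$ assumes $D_n=bD_p$ (the Einstein relation), as the quoted formula implies. The quoted $\gamma=1.5$ is reproduced when the thickness is entered as 0.3 cm; entering 0.3 mm (0.03 cm) in the same formula would give 0.15, so the thickness unit in this copy of the text may be garbled.

```python
import math


def ambipolar_diffusion(D_p, b):
    """D = 2*b*D_p / (b + 1), i.e. Eq. (9) with D_n = b*D_p assumed."""
    return 2.0 * b * D_p / (b + 1.0)


def derived_parameters(D_p, b, tau, sigma, a):
    """Return D, L, beta and gamma from the basic specimen parameters.

    Units assumed: cm, sec (D_p in cm^2/sec, sigma in cm/sec, a in cm).
    """
    D = ambipolar_diffusion(D_p, b)
    L = math.sqrt(D * tau)       # diffusion length
    beta = D_p / (L * sigma)
    gamma = sigma * a / D_p
    return {"D": D, "L": L, "beta": beta, "gamma": gamma}


# Inputs as quoted in the text, with a taken as 0.3 cm (see note above).
params = derived_parameters(D_p=40.0, b=2.0, tau=600e-6, sigma=200.0, a=0.3)
for name, value in params.items():
    print(f"{name} = {value:.3f}")
```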

The theoretical curve exhibits the initial peak; the 
difference in the two curves may stem partly from the 
discrepancy in the parameters, partly from the approxi- 
mations. As remarked, our solution is intended for only 
the transient phase (or with stated modifications, for 
the steady state). We have, however, plotted this 
solution for extended time, finding that it still portrays 
the correct qualitative behavior. For strong illumination 
(27) shows that the voltage saturates. Further, when a/L is small, the steady state saturation voltage is proportional to the surface recombination velocity $\sigma$; when a/L is large, the saturation voltage is independent of $\sigma$ but inversely proportional to the diffusion length L.

Mathematically, the key to the initial peaking of the function V lies in the relation of the first two terms in each of the infinite series. In the numerator $F_2>F_1$ and the $\nu_n$ increase progressively; hence the second term with high exponential damping dominates the series initially. Since the series is subtractive, the numerator grows quickly. In the denominator, on the other hand, $F_1'>F_2'$, and the first term with lower damping dominates the series, also subtractive; thus the denominator grows relatively slowly. However, in a short time, only the first term in each series remains significant and since $F_1'\gg F_1$ the denominator growth overtakes that of the numerator and brings the potential down toward a steady state value.

We have already seen in (17) the physically reasonable behavior: the Hall voltage developed is directly proportional to the current (through its proportionality to the concentration gradient) and is inversely proportional to the conductance.

While the steady state Hall voltage may be used to measure the surface recombination velocity of a sample, as observed by Moss et al.,³ the dependence of the transient behavior on the specimen parameters appears to be too complicated to afford a method for their
determination.


QUANTUM THEORY OF MANY-PARTICLE SYSTEMS. I


In order to calculate the average value of a physical quantity containing also many-particle interactions in a system of N antisymmetric particles, a set of generalized density matrices is defined. In order to permit the investigation of the same physical situation in two complementary spaces, the Hermitean density matrix of order k has two sets of indices, each of k variables, and it is further antisymmetric in each set of these indices.

Every normalizable antisymmetric wave function may be expanded in a series of determinants of order N over all ordered configurations formed from a basic complete set of one-particle functions $\psi_k$, which gives a representation of the wave function and its density matrices also in the discrete k-space. The coefficients in an expansion of an eigenfunction to a particular operator may be determined by the variation principle, leading to the ordinary secular equation of the method of configurational interaction. It is shown that the first-order density matrix may be brought to diagonal form, which defines the "natural spin-orbitals" associated with the system. The situation is then partly characterized by the corresponding occupation numbers, which are shown to lie between 0 and 1 and to assume the value 1 only if the corresponding spin-orbital occurs in all configurations necessary for describing the situation. If the system has exactly N spin-orbitals which are fully occupied, the total wave function may be reduced to a single Slater determinant. However, due to the mutual interaction between the particles, this limiting case is never physically realized, but the introduction of natural spin-orbitals leads then instead to a configurational expansion of most rapid convergence.

In case the basic set is of finite order M, the best choice of this set is determined by a form of extended Hartree-Fock equations. It is shown that, in this case, the natural spin-orbitals approximately fulfill some equations previously proposed by Slater.


IN the nonrelativistic quantum theory of many-particle systems, the basic Schrödinger equation refers to a configuration space having a dimension proportional to the number of particles. Even if it is possible to find a solution with sufficient accuracy with the aid of, for instance, modern electronic computers, this wave function is usually too complicated to provide a simple physical picture of the system. The aim of this paper is to give a discussion of the interpretation problem, and we will show that it is possible to define a series of density matrices, which have a simpler and more direct physical meaning than the wave function itself. Dirac¹ has previously introduced a density matrix for describing a system in the Hartree-Fock scheme, where the total wave function is approximated by a single Slater determinant, but the idea will here be essentially generalized in order to include the treatment of exact or approximate wave functions of arbitrary forms. The number of coordinates will further be doubled in order to permit us to consider the same physical situation in two complementary spaces.

* This work was supported in part by the U. S. Office of Naval Research under its contract with Massachusetts Institute of Technology.

¹ P. A. M. Dirac, Proc. Cambridge Phil. Soc. 26, 376 (1930); 27, 240 (1931). See also J. E. Lennard-Jones, Proc. Cambridge Phil. Soc. 27, 469 (1931), and V. Fock, Z. Physik 61, 126 (1930).


1. DEFINITION OF GENERAL DENSITY MATRICES 


Let us consider a system of N identical antisymmetric particles with the coordinates $x_1, x_2, \cdots x_N$ moving under the influence of a fixed potential framework and their mutual interaction. Each coordinate $x_i$ is a combination of a space coordinate $\mathbf{r}_i$ and a spin coordinate $s_i$, and, in considering nucleons, we will include also the coordinate of the isotopic spin. The physical situation of the system is described by a wave function $\Psi$, which we assume to be normalized. It fulfills the antisymmetry condition

$$P\Psi(x_1,x_2,\cdots x_N) = (-1)^p\,\Psi(x_1,x_2,\cdots x_N), \qquad (1)$$

where P is a permutation operator working on the indices of the N coordinates and p its parity. In considering the configuration space, we will let

$$(dx) = dx_1\,dx_2\cdots dx_N$$

indicate integration-summation over all coordinates, $(dx_i')$ the same procedure over all coordinates except $x_i$, $(dx_{ij}')$ the same over all coordinates except $x_i$ and $x_j$, etc.

A physical quantity $\Omega$ associated with the system is
represented in the configuration space by a Hermitean
operator $\Omega_{op}$, which is symmetrical in the indices of the
particles. It may be expressed in the form


$$\Omega_{op} = \Omega_{(0)} + \sum_i \Omega_i + \frac{1}{2!}{\sum_{ij}}'\,\Omega_{ij} + \frac{1}{3!}{\sum_{ijk}}'\,\Omega_{ijk} + \cdots, \tag{2}$$


where each term is a zero-, one-, two-, three-, ..., or
many-particle operator, respectively; the prime on the
summation signs indicates that we omit all terms having
two or more indices equal. In order to evaluate the
average value of this quantity in a situation characterized
by the normalized wave function $\Psi$, we will
now introduce a series of density matrices of various
orders:


$$\begin{aligned}
\gamma(x_1'|x_1) &= N\int \Psi^*(1'23\cdots N)\,\Psi(123\cdots N)\,(dx_1'),\\
\Gamma(x_1'x_2'|x_1x_2) &= \binom{N}{2}\int \Psi^*(1'2'3\cdots N)\,\Psi(123\cdots N)\,(dx_{12}'),\\
\Gamma^{(p)}(x_1'x_2'\cdots x_p'|x_1x_2\cdots x_p) &= \binom{N}{p}\int \Psi^*(1'2'\cdots p'\cdots N)\,\Psi(12\cdots p\cdots N)\,(dx_{12\cdots p}'),\\
\Gamma^{(N)}(x_1'x_2'\cdots x_N'|x_1x_2\cdots x_N) &= \Psi^*(1'2'3'\cdots N')\,\Psi(123\cdots N),
\end{aligned}\tag{3}$$


where, in the integrands, $i'$ and $i$ are abbreviations for
the coordinates $x_i'$ and $x_i$, respectively. These density
matrices are Hermitean, and they are antisymmetric in
each set of indices, so that


$$\Gamma(x_1x_2|x_1'x_2') = \Gamma^*(x_1'x_2'|x_1x_2), \qquad \Gamma(x_2'x_1'|x_1x_2) = -\Gamma(x_1'x_2'|x_1x_2), \tag{4}$$


and they are further related by the formula


$$\Gamma^{(p-1)}(x_1'x_2'\cdots x_{p-1}'|x_1x_2\cdots x_{p-1}) = \frac{p}{N+1-p}\int \Gamma^{(p)}(x_1'x_2'\cdots x_{p-1}'x_p|x_1x_2\cdots x_{p-1}x_p)\,dx_p. \tag{5}$$


Of special importance are the diagonal elements:


$$\gamma(x_1) = \gamma(x_1|x_1), \qquad \Gamma(x_1,x_2) = \Gamma(x_1x_2|x_1x_2), \quad\dots, \tag{6}$$


which are all positive definite. Because of the antisymmetry
of each set of indices, they are symmetric in
their coordinates. The diagonal elements have the
following physical interpretations: $\gamma(x_1)\,dv_1$ = number of
particles × the probability for finding a particle within
the volume $dv_1$ around the point $r_1$ having the spin $s_1$,
when all the other particles have arbitrary positions
and spins; $\Gamma(x_1,x_2)\,dv_1\,dv_2$ = number of pairs × the
probability for finding one particle within the volume $dv_1$
around the point $r_1$ with the spin $s_1$, and another
within the volume $dv_2$ around the point $r_2$ with the
spin $s_2$, all others having arbitrary positions and spins;
etc. According to (3), we obtain for the total integrals


$$\int\gamma(x_1)\,dx_1 = N, \qquad \int\Gamma(x_1,x_2)\,dx_1\,dx_2 = N(N-1)/2, \qquad \int\Gamma^{(p)}(x_1,\dots,x_p)\,dx_1\,dx_2\cdots dx_p = \binom{N}{p}. \tag{7}$$


Since the matrices (3) are antisymmetric in each set
of their indices, they will vanish identically if two (or
more) indices of a set are equal. For the diagonal
elements, we obtain in particular:


$$\Gamma(x_1,x_1) \equiv 0, \qquad \Gamma^{(3)}(x_1,x_2,x_2) \equiv 0, \quad\dots, \tag{8}$$


which shows that, for small distances, the antisym- 
metry requirement leads to a correlation effect which 
will strongly keep particles with parallel spins apart. 
This general phenomenon, which is an important con- 
sequence of the Pauli principle, was first noticed for free 
electrons as the ‘Fermi hole.’” 
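
The normalization (7), the reduction formula (5), and the vanishing of the diagonal elements (8) can be checked numerically on a small model. The following is only a minimal sketch under stated assumptions: the continuous coordinates are replaced by a few discrete "sites", spin is ignored, $N = 3$ is fixed, and all function names are hypothetical choices made for this illustration.

```python
import itertools
from math import comb

import numpy as np


def permutation_sign(perm):
    """Return +1 or -1 for an even or odd permutation given as a tuple."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def random_antisymmetric_state(n_sites, n_particles, seed=0):
    """Antisymmetrize a random tensor and normalize it, cf. Eq. (1)."""
    rng = np.random.default_rng(seed)
    raw = rng.normal(size=(n_sites,) * n_particles)
    psi = np.zeros_like(raw)
    for perm in itertools.permutations(range(n_particles)):
        psi += permutation_sign(perm) * np.transpose(raw, perm)
    return psi / np.linalg.norm(psi)


def density_matrices_three_particles(psi):
    """First- and second-order density matrices of Eq. (3) for N = 3.

    gamma[a, d]       stands for gamma(x1'|x1) with x1' = a, x1 = d.
    Gamma[a, b, c, d] stands for Gamma(x1'x2'|x1x2) with x1'x2' = a b, x1x2 = c d.
    """
    n = 3
    gamma = n * np.einsum("abc,dbc->ad", psi.conj(), psi)
    Gamma = comb(n, 2) * np.einsum("abe,cde->abcd", psi.conj(), psi)
    return gamma, Gamma


psi = random_antisymmetric_state(n_sites=4, n_particles=3)
gamma, Gamma = density_matrices_three_particles(psi)

# Eq. (7): total "integrals" (here sums over sites) of the diagonal elements.
print("trace of gamma:", np.trace(gamma))
print("sum of Gamma(x1, x2):", np.einsum("abab->", Gamma))

# Eq. (5) with p = 2: gamma follows from Gamma by summing over x2.
reduced = 2 / (3 + 1 - 2) * np.einsum("abcb->ac", Gamma)
print("reduction formula holds:", np.allclose(reduced, gamma))

# Eq. (8): Gamma(x1, x1) vanishes identically.
print("largest |Gamma(x1, x1)|:", np.abs(np.einsum("aaaa->a", Gamma)).max())
```

The discrete model is used only because it makes every integral a finite sum; nothing in the check depends on the particular random state.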

Let us illustrate the calculation of average values of
physical quantities by considering the two-particle
operator term in (2). By changing names of the
integration variables and using the symmetry properties,
we obtain


$$\begin{aligned}
\int \Psi^*\,\tfrac{1}{2}{\sum_{ij}}'\,\Omega_{ij}\,\Psi\,(dx)
&= \tfrac{1}{2}N(N-1)\int \Psi^*\,\Omega_{12}\,\Psi\,(dx)\\
&= \binom{N}{2}\int\{\Psi^*(1'2'3\cdots N)\,\Omega_{12}\,\Psi(123\cdots N)\}_{x_1'=x_1,\;x_2'=x_2}\,dx_1\,dx_2\,(dx_{12}')\\
&= \int\{\Omega_{12}\,\Gamma(x_1'x_2'|x_1x_2)\}_{x_1'=x_1,\;x_2'=x_2}\,dx_1\,dx_2.
\end{aligned}\tag{9}$$


2 See, for instance, E. Wigner and F. Seitz, Phys. Rev. 43, 804
(1933); and J. C. Slater, Phys. Rev. 81, 385 (1951).


In treating the density matrices, we will introduce the
convention that the operators $\Omega_{ij}$ will work only on
the unprimed coordinates $x_i$, $x_j$, etc., but not on $x_i'$,
$x_j'$, etc., and that, after the operations have been
carried out, we have to put $x_i' = x_i$, $x_j' = x_j$, etc. We
note that the diagonal elements of (3) are sufficient for
describing the physical situation in the ordinary
x-space, but that we need the nondiagonal elements for
characterizing the situation also in complementary
spaces, such as the momentum space. For the operator (2),
we obtain in this way


$$\begin{aligned}
\langle\Omega_{op}\rangle_{Av} &= \int \Psi^*\,\Omega_{op}\,\Psi\,(dx)\\
&= \Omega_{(0)} + \int \Omega_1\,\gamma(x_1'|x_1)\,dx_1 + \int \Omega_{12}\,\Gamma(x_1'x_2'|x_1x_2)\,dx_1\,dx_2\\
&\quad + \int \Omega_{123}\,\Gamma(x_1'x_2'x_3'|x_1x_2x_3)\,dx_1\,dx_2\,dx_3 + \cdots.
\end{aligned}\tag{10}$$


In order to illustrate the use of this fundamental 
formula by a few examples, let us first consider an 
electronic system (atom, molecule, or crystal) without 
external field at absolute zero. The system has the 
following basic Hamiltonian: 


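$$H_{op} = \frac{e^2}{2}{\sum_{gh}}'\frac{Z_gZ_h}{r_{gh}} + \sum_i\frac{p_i^2}{2m} - e^2\sum_i\sum_g\frac{Z_g}{r_{ig}} + \frac{e^2}{2}{\sum_{ij}}'\frac{1}{r_{ij}}, \tag{11}$$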
i] 

where $Z_g$ is the atomic number of the nucleus $g$. Here
we have neglected relativistic effects (including all
spin couplings) and the zero-point vibrations of the
nuclei. According to (10), the average energy is now
given by the expression


$$\langle H_{op}\rangle_{Av} = \frac{e^2}{2}{\sum_{gh}}'\frac{Z_gZ_h}{r_{gh}} + \frac{1}{2m}\int p_1^2\,\gamma(x_1'|x_1)\,dx_1 - e^2\sum_g Z_g\int\frac{\gamma(x_1)}{r_{1g}}\,dx_1 + e^2\int\frac{\Gamma(x_1,x_2)}{r_{12}}\,dx_1\,dx_2, \tag{12}$$


where the first term is the repulsive Coulomb potential
between the nuclei, the second the kinetic energy of the
electrons, the third the attractive Coulomb potential
between the nuclei and the electrons, and the last term
the repulsive Coulomb potential between the electrons.
For the description of the energy of such a system, it is
therefore sufficient to know the second-order density
matrix $\Gamma(x_1'x_2'|x_1x_2)$, from which the first-order density
matrix may be calculated according to (5).

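Let us then consider the total spin $S^2$, measured in
units $\hbar$, of a system of $N$ antisymmetric particles.
According to Dirac,³ we have


$$\begin{aligned}
S^2 &= \sum_{ij} S_i\cdot S_j\\
&= -N(N-4)/4 + {\sum_{ij}}'\,(1+\sigma_i\cdot\sigma_j)/4\\
&= -N(N-4)/4 + \tfrac{1}{2}{\sum_{ij}}'\,P_{ij}^{\sigma},
\end{aligned}\tag{13}$$


where $P_{ij}^{\sigma}$ is the operator for permuting the spin
coordinates $s_i$ and $s_j$ of the particles $i$ and $j$. Applying
formula (10), we obtain


$$\langle S^2\rangle_{Av} = -N(N-4)/4 + \int\Gamma(r_1s_1, r_2s_2\,|\,r_1s_2, r_2s_1)\,dx_1\,dx_2, \tag{14}$$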


which expression may be evaluated from the knowledge 
of the second order density matrix. 
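
Dirac's identity (13) is easy to confirm numerically for a few spin-1/2 particles. The sketch below is an illustration only: it builds the spin operators as explicit matrices (in units of $\hbar$) with Kronecker products and compares $(\sum_i S_i)^2$ with $-N(N-4)/4 + \sum_{i<j}P_{ij}^{\sigma}$, where $P_{ij}^{\sigma} = (1+\sigma_i\cdot\sigma_j)/2$. The helper names are hypothetical.

```python
from functools import reduce

import numpy as np

# Pauli matrices sigma_x, sigma_y, sigma_z.
PAULI = (
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
)


def site_operator(op, site, n):
    """Embed a single-spin operator acting on `site` into the n-spin space."""
    factors = [np.eye(2, dtype=complex) for _ in range(n)]
    factors[site] = op
    return reduce(np.kron, factors)


def total_spin_squared(n):
    """S^2 = (sum_i S_i)^2 with S_i = sigma_i / 2."""
    dim = 2 ** n
    s2 = np.zeros((dim, dim), dtype=complex)
    for pauli in PAULI:
        s_axis = sum(0.5 * site_operator(pauli, i, n) for i in range(n))
        s2 += s_axis @ s_axis
    return s2


def spin_exchange(i, j, n):
    """P_ij = (1 + sigma_i . sigma_j) / 2, the spin permutation operator."""
    dot = sum(site_operator(p, i, n) @ site_operator(p, j, n) for p in PAULI)
    return 0.5 * (np.eye(2 ** n) + dot)


def dirac_form(n):
    """Right-hand side of Eq. (13); the primed double sum equals 2 * sum_{i<j}."""
    result = -n * (n - 4) / 4 * np.eye(2 ** n, dtype=complex)
    for i in range(n):
        for j in range(i + 1, n):
            result += spin_exchange(i, j, n)
    return result


for n in (2, 3, 4):
    print(n, np.allclose(total_spin_squared(n), dirac_form(n)))
```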

As a last example, we will consider the operator for 
the electric moment D, 


$$D = e\sum_i r_i. \tag{15}$$


According to (10), its average value for a particular 
situation is given by the diagonal elements of the first- 
order density matrix: 


$$\langle D\rangle_{Av} = e\int r_1\,\gamma(x_1)\,dx_1. \tag{16}$$


In this connection we observe that we shall sometimes
need quantities which are related to the transition
of the system between two orthogonal states, I and II,
which are characterized by the normalized wave functions
$\Psi_{I}$ and $\Psi_{II}$. In analogy to (3), we will for this
purpose define the transition matrices:


$$\begin{aligned}
\gamma_{I,II}(x_1'|x_1) &= N\int \Psi_{I}^*(1'23\cdots N)\,\Psi_{II}(123\cdots N)\,(dx_1'),\\
\Gamma_{I,II}(x_1'x_2'|x_1x_2) &= \binom{N}{2}\int \Psi_{I}^*(1'2'3\cdots N)\,\Psi_{II}(123\cdots N)\,(dx_{12}'),
\end{aligned}\tag{17}$$


and, in the same way as before, we can derive the
formula


$$\begin{aligned}
(I|\Omega_{op}|II) &= \int \Psi_{I}^*\,\Omega_{op}\,\Psi_{II}\,(dx)\\
&= \int \Omega_1\,\gamma_{I,II}(x_1'|x_1)\,dx_1 + \int \Omega_{12}\,\Gamma_{I,II}(x_1'x_2'|x_1x_2)\,dx_1\,dx_2 + \cdots.
\end{aligned}\tag{18}$$


For the transition moment, we obtain, in particular,


$$(I|D|II) = e\int r_1\,\gamma_{I,II}(x_1)\,dx_1. \tag{19}$$


3 P. A. M. Dirac, Proc. Roy. Soc. (London) A123, 714 (1929).


We observe the simplification of the physical picture 
of the system provided by the use of the density ma- 
trices (3). In considering a physical quantity (2), con- 
taining many-particle operators up to the order k, the 
average value of this quantity is determined by (10) 
and the density matrix of order k, from which all 
density matrices of lower orders may be evaluated suc- 
cessively by using (5). For k=2, we obtain for 
instance 


$$\langle\Omega_{op}\rangle_{Av} = \frac{\displaystyle\int\Big\{\Omega_{(0)} + N\,\Omega_1 + \binom{N}{2}\Omega_{12}\Big\}\,\Gamma(x_1'x_2'|x_1x_2)\,dx_1\,dx_2}{\displaystyle\int\Gamma(x_1,x_2)\,dx_1\,dx_2}, \tag{20}$$


where, in agreement with the convention introduced
in connection with (9), we have to put $x_1' = x_1$ and
$x_2' = x_2$ after the operations in the integrand in the
numerator have been carried out; the denominator is
introduced in order to take care automatically of the
normalization.

The density matrices (3) may be derived from the
wave function $\Psi$ or from the matrix of highest order
$k = N$. It would also be of some interest to investigate
the reverse problem and to see how much the knowledge
of a lower-order density matrix ($k < N$) would determine
the wave function, i.e., the physical situation of the
system. From (10) it is clear that the average values of
all physical quantities (2) containing only many-particle
operators up to the order $k$ are entirely fixed. The
eigenfunctions of such an operator fulfill the relation,


$$\Omega_{op}\Psi = W\Psi, \tag{21}$$


and may also be derived from the variation principle,


$$\delta\langle\Omega_{op}\rangle_{Av} = 0, \tag{22}$$


which leads to a variation condition for the density
matrix of order $k$; compare (20). In this and following
papers we will discuss these problems in some detail, and
we will show that these preliminary results are quickly
changed if we impose also additional restrictions on the
form of the wave function. It will, for instance, be
shown that, in the Hartree-Fock scheme where the
total wave function is approximated by a single Slater
determinant, the first-order density matrix $\gamma(x_1'|x_1)$
alone determines all the higher-order matrices, the
wave function, and consequently the entire physical
situation. In the part of our present electronic theory
of atoms, molecules, and crystals, which is based on the
Hartree-Fock approximation, the first-order density
matrix is therefore an appropriate tool for giving a
simple physical picture of the system. In the following,
we will largely concentrate our interest on the properties
of the general first-order density matrix, and we
will investigate its behavior also in the higher
approximations.


2. ANALYSIS OF THE PROPERTIES OF THE 
DENSITY MATRICES 


(a) Expansion Theorem 


In order to investigate the properties of the density
matrices in greater detail, we will introduce an orthonormal
and complete set of discrete⁴ one-particle functions
$\psi_k(x)$ ($k = 1, 2, 3, \dots$) of such a type that every
normalizable function $\varphi(x)$ of a single coordinate $x$ may
be expanded in the form


$$\varphi(x) = \sum_k \psi_k(x)\,c_k, \qquad c_k = \int \varphi(x_1)\,\psi_k^*(x_1)\,dx_1. \tag{23}$$


Following Slater,⁵ we will here include all spin properties
explicitly in the wave functions, and the one-particle
functions $\psi(r,s)$ are therefore spin-orbitals,
obtained by multiplying two complete orthonormal
sets of orbitals (being functions only of $r$) by the spin
functions $\alpha(s)$ and $\beta(s)$, respectively. In considering
nucleons, we include the isotopic spin functions in the
same way. From the very beginning, we are going to
make ourselves free from the idea of "doubly filled
orbitals," and the two sets of basic orbitals associated
with the ordinary spin need therefore not necessarily
be the same, but the orbitals in one set may, of course,
be expanded in the orbitals of the other set. These
distinctions will later be of value in treating correlation
properties.


4 For the sake of simplicity, the set is here chosen discrete, but
there are no major difficulties in extending the treatment to
include also continuous sets.

5 J. C. Slater, Phys. Rev. 34, 1293 (1929).

By introducing a set of spin-orbitals $\psi_k(x_i)$ for each
coordinate $x_i$ ($i = 1, 2, \dots, N$) and by successively
applying (23), we may now expand every normalizable
function $\Psi$ in configuration space in the following form:


$$\Psi(x_1,x_2,\dots,x_N) = \sum_{k_1k_2\cdots k_N}\psi_{k_1}(x_1)\,\psi_{k_2}(x_2)\cdots\psi_{k_N}(x_N)\,C(k_1,k_2,\dots,k_N), \tag{24}$$


$$C(k_1,k_2,\dots,k_N) = \int \Psi(12\cdots N)\,\psi_{k_1}^*(1)\,\psi_{k_2}^*(2)\cdots\psi_{k_N}^*(N)\,(dx). \tag{25}$$


For antisymmetric functions fulfilling (1), it follows
from (25) that also the coefficients $C$ are antisymmetric
in their indices:


$$P\,C(k_1,k_2,\dots,k_N) = (-1)^p\,C(k_1,k_2,\dots,k_N). \tag{26}$$


A selection of $N$ indices $k_1, k_2, \dots, k_N$ will in the
following be called a configuration, and the space described
by all values of these indices will simply be called the
k-space. In the terminology of the transformation
theory, the antisymmetric quantity $C(k_1,k_2,\dots,k_N)$ may
be considered as the representation of the wave function
in the k-space, and we note that it fulfills the
normalization condition


$$\sum_{k_1k_2\cdots k_N}|C(k_1k_2\cdots k_N)|^2 = \int|\Psi|^2\,(dx). \tag{27}$$


Because of the property (26), the number of independent
coefficients $C$ in expansion (24) may be essentially
reduced, for instance by referring the indices to a
specific order. If a selection of $N$ indices $k_1, k_2, \dots, k_N$
fulfills the condition $k_1 < k_2 < \cdots < k_N$, it will in the
following be called an ordered configuration and will be
denoted by the abbreviated symbol $K$. In this connection,
it is also convenient to introduce the symbol


$$C_K = (N!)^{1/2}\,C(k_1,k_2,\dots,k_N), \tag{28}$$


for then the normalization condition (27) takes the form


$$\sum_K|C_K|^2 = \int|\Psi|^2\,(dx), \tag{29}$$


where we have to sum only over the ordered configurations
$K$.

The quantities $C_K$ represent all independent coefficients
in expansion (24). By permuting the dummy
indices $k_1, k_2, \dots, k_N$ and by using (26) and (28), we
may now rearrange this expansion in the following way:


$$\begin{aligned}
\Psi(x_1,x_2,\dots,x_N)
&= (N!)^{-1}\sum_P P_k\sum_{k_1k_2\cdots k_N}\psi_{k_1}(x_1)\,\psi_{k_2}(x_2)\cdots\psi_{k_N}(x_N)\,C(k_1k_2\cdots k_N)\\
&= (N!)^{-1}\sum_{k_1k_2\cdots k_N}C(k_1k_2\cdots k_N)\sum_P(-1)^p\,P_k\,\psi_{k_1}(x_1)\,\psi_{k_2}(x_2)\cdots\psi_{k_N}(x_N)\\
&= \sum_{k_1<k_2<\cdots<k_N}C(k_1k_2\cdots k_N)\,\det\{\psi_{k_1},\psi_{k_2},\dots,\psi_{k_N}\}\\
&= \sum_K C_K\,\Psi_K(x_1,x_2,\dots,x_N),
\end{aligned}\tag{30}$$


where


$$\Psi_K(x_1,x_2,\dots,x_N) = (N!)^{-1/2}\,\det\{\psi_{k_1},\psi_{k_2},\dots,\psi_{k_N}\} \tag{31}$$


is the normalized Slater determinant belonging to the
ordered configuration $K$. Hence it is possible to expand
an antisymmetric wave function in configuration space
in a series of Slater determinants over all ordered
configurations $K$.

Two ordered configurations $K = (k_1,k_2,\dots,k_N)$ and
$L = (l_1,l_2,\dots,l_N)$ are said to be the same if they are
identical in all their indices, and they are said to be
different if they differ in at least one index. It is easily
shown⁶ that two Slater determinants $\Psi_K$ and $\Psi_L$
belonging to two different ordered configurations $K$ and
$L$ are orthogonal, and hence we have


$$\int\Psi_K^*\,\Psi_L\,(dx) = \delta_{KL}. \tag{32}$$


The Slater determinants associated with all ordered
configurations therefore form an orthonormal set, which
is complete with respect to normalizable, antisymmetric
functions in configuration space. The coefficients $C_K$
may be derived from (30) and (32) or from (25) and
(28), which gives the two expressions


$$\begin{aligned}
C_K &= \int\Psi(12\cdots N)\,\Psi_K^*\,(dx)\\
&= (N!)^{1/2}\int\Psi(12\cdots N)\,\psi_{k_1}^*(1)\,\psi_{k_2}^*(2)\cdots\psi_{k_N}^*(N)\,(dx).
\end{aligned}\tag{33}$$


The connection between them is discussed in greater 
detail in Appendix I. 

We may now also expand the density matrices (3) 
in a similar way. For the first-order matrix, we obtain 


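$$\gamma(x_1'|x_1) = \sum_{kl}\psi_k^*(x_1')\,\psi_l(x_1)\,\gamma(l|k), \tag{34}$$


where, according to (4), the coefficients form a Hermitean
matrix: $\gamma(k|l) = \gamma^*(l|k)$. For a density matrix of
order $p$, which is Hermitean and antisymmetric in each
set of its indices, we obtain in general:


$$\begin{aligned}
\Gamma^{(p)}(x_1'x_2'\cdots x_p'|x_1x_2\cdots x_p)
&= \sum_{\substack{k_1k_2\cdots k_p\\ l_1l_2\cdots l_p}}\psi_{k_1}^*(x_1')\cdots\psi_{k_p}^*(x_p')\,\psi_{l_1}(x_1)\cdots\psi_{l_p}(x_p)\,\Gamma^{(p)}(l_1l_2\cdots l_p|k_1k_2\cdots k_p)\\
&= \sum_{\substack{k_1<k_2<\cdots<k_p\\ l_1<l_2<\cdots<l_p}}\det[\psi_{k_i}^*(x_j')]\;\det[\psi_{l_i}(x_j)]\;\Gamma^{(p)}(l_1l_2\cdots l_p|k_1k_2\cdots k_p),
\end{aligned}\tag{35}$$


where $\det[\psi_{l_i}(x_j)]$ denotes the determinant of order $p$ with the
elements $\psi_{l_i}(x_j)$, $i, j = 1, 2, \dots, p$.


6 See Eq. (39).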

The expansion coefficients $\Gamma^{(p)}(l_1,l_2,\dots,l_p|k_1,k_2,\dots,k_p)$,
which are Hermitean and antisymmetric in each set of
their indices, may be considered as the representations
of the density matrices in the k-space, and now it
remains to investigate how these densities depend on
the wave function $C(k_1,k_2,\dots,k_N)$, i.e., to derive the
relations in k-space corresponding to the definitions (3).


(b) Density and Transition Matrices 
for Slater Determinants 


In order to derive the general expressions for the 
density matrices in k-space, we will first consider the 
transition matrices associated with two Slater deter- 
minants U and V: 


$$U = (N!)^{-1/2}\,\det\{u_1,u_2,\dots,u_N\}, \qquad V = (N!)^{-1/2}\,\det\{v_1,v_2,\dots,v_N\}, \tag{36}$$


which are built up from two basic sets of spin-orbitals
$u_1, u_2, \dots, u_N$ and $v_1, v_2, \dots, v_N$. For the sake of
completeness, we will not impose any orthogonality
condition on the sets $u$ and $v$, and we will further assume
that they have mutual "nonorthogonality" integrals


$$d_{uv}(kl) = \int u_k^*(x_1)\,v_l(x_1)\,dx_1, \tag{37}$$


which may be different from zero for $k \neq l$. If there is
no risk for confusion, we will often in the symbol $d_{uv}$
omit the indices $u$ and $v$.

By using formula (109) in Appendix I, we obtain


$$\begin{aligned}
\int U^*V\,(dx) &= \int u_1^*(x_1)\,u_2^*(x_2)\cdots u_N^*(x_N)\,\det\{v_1,v_2,\dots,v_N\}\,(dx)\\
&= \sum_P(-1)^p\,P\,d(1l_1)\,d(2l_2)\cdots d(Nl_N)\\
&= \det\{d(kl)\},
\end{aligned}\tag{38}$$


which shows that the "nonorthogonality" integral of
two Slater determinants $U$ and $V$ equals the determinant
$D_{UV}$ of all the "nonorthogonality" integrals
$d_{uv}(kl)$ associated with the two sets of one-particle
functions involved:


$$\int U^*V\,(dx) = D_{UV} = \det\{d_{uv}(kl)\}. \tag{39}$$


The determinant $D_{UV}$ is of basic importance for the
following discussion, and we need it as well as its minors
of various orders:


$$D_{UV}(k|l), \quad D_{UV}(k_1k_2|l_1l_2), \quad D_{UV}(k_1k_2k_3|l_1l_2l_3), \quad\dots. \tag{40}$$


These minors are originally defined only for ordered
sets $k_1 < k_2 < \cdots < k_p$ and $l_1 < l_2 < \cdots < l_p$, but they are
easily generalized to the total $(k,l)$-space by assuming
that they are antisymmetric functions in each set of
their indices. 
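
The overlap formula (39) can be illustrated on a discrete model. This is a sketch under stated assumptions, not part of the derivation: each spin-orbital is represented by a complex vector over a handful of grid points (so that integrals become sums), the two sets $u$ and $v$ are random and deliberately nonorthogonal, and the determinants (36) are built explicitly as $N$-index tensors. All names are hypothetical.

```python
import itertools
import math
from functools import reduce

import numpy as np


def permutation_sign(perm):
    """Return +1 or -1 for an even or odd permutation given as a tuple."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def slater_tensor(orbitals):
    """(N!)^(-1/2) det{phi_1, ..., phi_N} as a tensor T[x1, ..., xN].

    `orbitals` has shape (n_points, N); column k holds orbital k on the grid.
    """
    n_points, n = orbitals.shape
    tensor = np.zeros((n_points,) * n, dtype=complex)
    for perm in itertools.permutations(range(n)):
        # Product phi_{perm[0]}(x1) * phi_{perm[1]}(x2) * ...
        term = reduce(np.multiply.outer, [orbitals[:, k] for k in perm])
        tensor += permutation_sign(perm) * term
    return tensor / math.sqrt(math.factorial(n))


rng = np.random.default_rng(0)
u = rng.normal(size=(5, 3)) + 1j * rng.normal(size=(5, 3))
v = rng.normal(size=(5, 3)) + 1j * rng.normal(size=(5, 3))

U = slater_tensor(u)
V = slater_tensor(v)

overlap_direct = np.vdot(U, V)   # sum over all coordinates of U* V
d = u.conj().T @ v               # d(kl) of Eq. (37)
overlap_det = np.linalg.det(d)   # D_UV of Eq. (39)

print(np.isclose(overlap_direct, overlap_det))
```

The direct sum costs on the order of $M^N$ operations for $M$ grid points, whereas the determinant of the $N \times N$ matrix $d(kl)$ costs on the order of $N^3$, which is the practical content of (39).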

In the following we need also the minors of the
determinants in (36), which will be denoted by symbols of
the type


$$\det{}_u(12\cdots p|k_1k_2\cdots k_p), \qquad \det{}_v(12\cdots p|l_1l_2\cdots l_p). \tag{41}$$


The minors of order $p$ are determinants of order $(N-p)$,
and we note that, according to (39), they fulfill the
relation


$$[(N-p)!]^{-1}\int\det{}_u^*(12\cdots p|k_1k_2\cdots k_p)\,\det{}_v(12\cdots p|l_1l_2\cdots l_p)\,(dx_{12\cdots p}') = D_{UV}(k_1k_2\cdots k_p|l_1l_2\cdots l_p). \tag{42}$$


In order to derive the first-order transition matrix
associated with $U$ and $V$ and defined by (17), we will
expand the determinants (36) in terms of their first
rows:


$$U(x_1x_2\cdots x_N) = (N!)^{-1/2}\sum_k u_k(x_1)\,\det{}_u(1|k), \qquad V(x_1x_2\cdots x_N) = (N!)^{-1/2}\sum_l v_l(x_1)\,\det{}_v(1|l). \tag{43}$$


By using (42), we then obtain


$$\begin{aligned}
N\int U^*(x_1'x_2\cdots x_N)\,V(x_1x_2\cdots x_N)\,(dx_1')
&= \sum_{kl}u_k^*(x_1')\,v_l(x_1)\,[(N-1)!]^{-1}\int\det{}_u^*(1|k)\,\det{}_v(1|l)\,(dx_1')\\
&= \sum_{kl}u_k^*(x_1')\,v_l(x_1)\,D_{UV}(k|l).
\end{aligned}\tag{44}$$


Taking the normalization of $U$ and $V$ into account, we
finally get


$$\gamma_{UV}(x_1'|x_1) = (D_{UU}D_{VV})^{-1/2}\sum_{kl}u_k^*(x_1')\,v_l(x_1)\,D_{UV}(k|l). \tag{45}$$
In order to derive the transition matrix of order $p$, we will use Laplace's theorem⁷ and expand the
determinants (36) in terms of their first $p$ rows:


$$\begin{aligned}
U(x_1x_2\cdots x_N) &= (N!)^{-1/2}\sum_{k_1<k_2<\cdots<k_p}\det[u_{k_i}(x_j)]\;\det{}_u(12\cdots p|k_1k_2\cdots k_p),\\
V(x_1x_2\cdots x_N) &= (N!)^{-1/2}\sum_{l_1<l_2<\cdots<l_p}\det[v_{l_i}(x_j)]\;\det{}_v(12\cdots p|l_1l_2\cdots l_p),
\end{aligned}\tag{46}$$


where $\det[u_{k_i}(x_j)]$ denotes the determinant of order $p$ with the elements
$u_{k_i}(x_j)$, $i, j = 1, 2, \dots, p$. According to (17) and (42), we then obtain


$$\begin{aligned}
\binom{N}{p}&\int U^*(x_1'x_2'\cdots x_p'x_{p+1}\cdots x_N)\,V(x_1x_2\cdots x_px_{p+1}\cdots x_N)\,(dx_{12\cdots p}')\\
&= (p!)^{-1}\sum_{\substack{k_1<k_2<\cdots<k_p\\ l_1<l_2<\cdots<l_p}}\det[u_{k_i}^*(x_j')]\;\det[v_{l_i}(x_j)]\;[(N-p)!]^{-1}\int\det{}_u^*(12\cdots p|k_1k_2\cdots k_p)\,\det{}_v(12\cdots p|l_1l_2\cdots l_p)\,(dx_{12\cdots p}')\\
&= (p!)^{-1}\sum_{\substack{k_1<k_2<\cdots<k_p\\ l_1<l_2<\cdots<l_p}}\det[u_{k_i}^*(x_j')]\;\det[v_{l_i}(x_j)]\;D_{UV}(k_1k_2\cdots k_p|l_1l_2\cdots l_p)\\
&= (p!)^{-1}\sum_{\substack{k_1k_2\cdots k_p\\ l_1l_2\cdots l_p}}u_{k_1}^*(x_1')\cdots u_{k_p}^*(x_p')\,v_{l_1}(x_1)\cdots v_{l_p}(x_p)\,D_{UV}(k_1k_2\cdots k_p|l_1l_2\cdots l_p),
\end{aligned}\tag{47}$$


where, in the last form, we are using the generalized
minors defined in the entire $(k,l)$-space by the antisymmetry
requirement. Observing the normalization of $U$
and $V$, we get therefore for the transition matrix of
order $p$ in the $(k,l)$-space:


$$\Gamma_{UV}^{(p)}(l_1l_2\cdots l_p|k_1k_2\cdots k_p) = (p!)^{-1}\,D_{UV}(k_1k_2\cdots k_p|l_1l_2\cdots l_p)\,(D_{UU}D_{VV})^{-1/2}. \tag{48}$$


According to (18), we are now able to find the matrix
element of an operator $\Omega_{op}$ with respect to two Slater
determinants:


$$\begin{aligned}
\int U^*\,\Omega_{op}\,V\,(dx) &= \Omega_{(0)}D_{UV} + \sum_{kl}\{k|\Omega_1|l\}\,D_{UV}(k|l)\\
&\quad + \frac{1}{2!}\sum_{\substack{k_1k_2\\ l_1l_2}}\{k_1k_2|\Omega_{12}|l_1l_2\}\,D_{UV}(k_1k_2|l_1l_2)\\
&\quad + \frac{1}{3!}\sum_{\substack{k_1k_2k_3\\ l_1l_2l_3}}\{k_1k_2k_3|\Omega_{123}|l_1l_2l_3\}\,D_{UV}(k_1k_2k_3|l_1l_2l_3) + \cdots,
\end{aligned}\tag{49}$$


where we have used the matrix notations


$$\begin{aligned}
\{k|\Omega_1|l\} &= \int u_k^*(x_1)\,\Omega_1\,v_l(x_1)\,dx_1,\\
\{k_1k_2|\Omega_{12}|l_1l_2\} &= \int u_{k_1}^*(x_1)\,u_{k_2}^*(x_2)\,\Omega_{12}\,v_{l_1}(x_1)\,v_{l_2}(x_2)\,dx_1\,dx_2, \quad\dots.
\end{aligned}\tag{50}$$


7 See, for instance, G. Kowalewski, Determinantentheorie (Veit
& Company, Leipzig, 1909).

This is the general formula⁸ for nonorthogonal basic sets
$u_k$ and $v_l$. The corresponding formula for the orthogonal
case was first derived by Slater,⁹ and the nonorthogonal
case has then been discussed rather extensively in the
literature.¹⁰ We note that the formula for the diagonal
elements ($U = V$) in the nonorthogonal case, previously
given by the author,¹¹ was derived in an entirely different
way.


8 A preliminary report of this result was given in P. O. Löwdin,
Quarterly Progress Report of the Solid-State and Molecular
Theory Group at Massachusetts Institute of Technology, January
15, 1952 (unpublished), p. 10. For some simplifications in the
present derivation, the author is indebted to discussions with Dr.
A. Meckler, Massachusetts Institute of Technology.

9 J. C. Slater, Phys. Rev. 34, 1293 (1929); 38, 1109 (1931); see
also E. U. Condon, Phys. Rev. 36, 1121 (1930).

10 J. C. Slater, Phys. Rev. 35, 509 (1930); J. E. Lennard-Jones,
Proc. Cambridge Phil. Soc. 27, 469 (1931), particularly p. 480;
D. R. Inglis, Phys. Rev. 46, 135 (1934); H. M. James, J. Chem.
Phys. 2, 794 (1934); J. H. Van Vleck, Phys. Rev. 49, 232 (1936);
R. Landshoff, Z. Physik 102, 201 (1936); G. H. Wannier, Phys.
Rev. 52, 191 (1937); B. H. Chirgwin and C. A. Coulson, Proc.
Roy. Soc. (London) A201, 196 (1950); W. J. Carr, Phys. Rev. 9,
28 (1953).

In the special case when $D_{UV} \neq 0$, a considerable
simplification may be introduced in (49), for, according to
a well-known theorem in the theory of determinants,⁷
we then have


$$D_{UV}(k_1k_2\cdots k_p|l_1l_2\cdots l_p)
= D_{UV}^{\,1-p}\begin{vmatrix}
D_{UV}(k_1|l_1) & \cdots & D_{UV}(k_1|l_p)\\
\vdots & & \vdots\\
D_{UV}(k_p|l_1) & \cdots & D_{UV}(k_p|l_p)
\end{vmatrix}
= D_{UV}\begin{vmatrix}
d^{-1}(l_1k_1) & \cdots & d^{-1}(l_pk_1)\\
\vdots & & \vdots\\
d^{-1}(l_1k_p) & \cdots & d^{-1}(l_pk_p)
\end{vmatrix}, \tag{51}$$


where $d^{-1}(lk)$ is the inverse matrix to the matrix $d(kl)$,
defined by (37). It may be shown that, in this case, all
transition matrices may be expressed in the fundamental
invariant


$$\sum_{kl}u_k^*(x_1')\,v_l(x_1)\,d^{-1}(lk), \tag{52}$$


and that, except for a factor, all higher-order matrices
may be expressed as determinants of the first-order
matrix. This case will be discussed in greater detail in
a following paper.
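
The determinant theorem behind (51) can be spot-checked numerically. The sketch below is illustrative only and rests on one labeled assumption about conventions: the generalized minor $D_{UV}(k_1\cdots k_p|l_1\cdots l_p)$ for ordered index sets is taken to be the signed complementary minor, i.e. $(-1)^{\sum k_i+\sum l_i}$ times the determinant that remains after deleting rows $k_i$ and columns $l_i$ of $d(kl)$. The function names are hypothetical.

```python
import numpy as np


def signed_minor(d, rows, cols):
    """Signed complementary minor for ordered index sets `rows`, `cols`.

    Assumed convention: delete the listed rows and columns of d and attach
    the sign (-1)^(sum(rows) + sum(cols)).
    """
    reduced = np.delete(np.delete(d, list(rows), axis=0), list(cols), axis=1)
    sign = (-1) ** (sum(rows) + sum(cols))
    return sign * np.linalg.det(reduced)


def minor_from_inverse(d, rows, cols):
    """Last form of Eq. (51): D * det[d^{-1}(l_j k_i)]."""
    inverse = np.linalg.inv(d)
    block = inverse[np.ix_(list(cols), list(rows))]
    return np.linalg.det(d) * np.linalg.det(block)


rng = np.random.default_rng(0)
d = rng.normal(size=(5, 5))

for rows, cols in [((1,), (3,)), ((0, 2), (1, 3)), ((0, 1, 4), (1, 2, 3))]:
    lhs = signed_minor(d, rows, cols)
    rhs = minor_from_inverse(d, rows, cols)
    print(rows, cols, np.isclose(lhs, rhs))
```

For $p = 1$ the relation reduces to the familiar statement that the elements of the inverse matrix are the cofactors divided by the determinant.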


(c) General Properties of the Density Matrices 


We are now ready to discuss the general properties of
the density matrices (3) and their representations in the
k-space. If we let the symbol $(k)$ denote all ordered
configurations $K$ containing a specific index $k$, the symbol
$(k_1k_2)$ denote all ordered configurations $K$ containing
a specific pair of indices $k_1$ and $k_2$, etc., then we may
rearrange our summations by the formulas


$$\sum_K\sum_k^{K} = \sum_k\sum_K^{(k)}, \qquad \sum_K\sum_{k_1k_2}^{K} = \sum_{k_1k_2}\sum_K^{(k_1k_2)}, \quad\dots, \tag{53}$$


where a superscript $K$ on an index sum restricts it to the indices
contained in the configuration $K$. We will assume that our
normalizable wave function $\Psi$ may be expanded in a series
of Slater determinants $\Psi_K$ over all ordered configurations $K$:


$$\Psi = \sum_K\Psi_K\,C_K, \tag{54}$$


according to (30). Applying (45) and (53), we obtain
for the first-order density:


$$\begin{aligned}
\gamma(x_1'|x_1) &= \sum_{KL}C_K^*\,\gamma_{KL}(x_1'|x_1)\,C_L\Big/\sum_K|C_K|^2\\
&= \sum_{KL}C_K^*\,C_L\sum_k^{K}\sum_l^{L}\psi_k^*(x_1')\,\psi_l(x_1)\,D_{KL}(k|l)\Big/\sum_K|C_K|^2\\
&= \sum_{kl}\psi_k^*(x_1')\,\psi_l(x_1)\sum_K^{(k)}\sum_L^{(l)}C_K^*\,D_{KL}(k|l)\,C_L\Big/\sum_K|C_K|^2,
\end{aligned}\tag{55}$$


or, for the first-order density in the k-space:


$$\gamma(l|k) = \sum_K^{(k)}\sum_L^{(l)}C_K^*\,D_{KL}(k|l)\,C_L\Big/\sum_K|C_K|^2. \tag{56}$$
K L K 


Similarly, by using (48) and (53), we obtain for the
density of order $p$:


$$\begin{aligned}
\Gamma^{(p)}(x_1'x_2'\cdots x_p'|x_1x_2\cdots x_p)
&= \sum_{KL}C_K^*\,\Gamma_{KL}^{(p)}(x_1'x_2'\cdots x_p'|x_1x_2\cdots x_p)\,C_L\Big/\sum_K|C_K|^2\\
&= \sum_{\substack{k_1k_2\cdots k_p\\ l_1l_2\cdots l_p}}\psi_{k_1}^*(x_1')\cdots\psi_{k_p}^*(x_p')\,\psi_{l_1}(x_1)\cdots\psi_{l_p}(x_p)\,\Gamma^{(p)}(l_1l_2\cdots l_p|k_1k_2\cdots k_p),
\end{aligned}\tag{57}$$


$$\Gamma^{(p)}(l_1l_2\cdots l_p|k_1k_2\cdots k_p) = (p!)^{-1}\sum_K^{(k_1k_2\cdots k_p)}\sum_L^{(l_1l_2\cdots l_p)}C_K^*\,D_{KL}(k_1k_2\cdots k_p|l_1l_2\cdots l_p)\,C_L\Big/\sum_K|C_K|^2. \tag{58}$$


This formula gives the density matrices in k-space
expressed in the wave function $C(k_1,k_2,\dots,k_N)$ or its
independent elements $C_K$, and it corresponds therefore
to the definitions (3).

We note that the density matrices in k-space are
Hermitean and antisymmetric in each set of their
indices. By using (7) and (57), we find for their total
values


$$\sum_k\gamma(k|k) = N, \qquad \sum_{k_1k_2\cdots k_p}\Gamma^{(p)}(k_1k_2\cdots k_p|k_1k_2\cdots k_p) = \binom{N}{p}, \tag{59}$$


which shows that the normalization is correct. The
diagonal elements


$$\gamma(k) = \gamma(k|k), \qquad \Gamma(k_1,k_2) = \Gamma(k_1k_2|k_1k_2), \quad\dots, \tag{60}$$


may be interpreted analogously to the diagonal elements
in x-space: $\gamma(k)$ = number of particles × the probability
for finding a particle in the spin-orbital $k$ when
all the other particles occupy arbitrary spin-orbitals;
$\Gamma(k_1,k_2)$ = number of pairs × the probability for finding
one particle in the spin-orbital $k_1$ and another in the
spin-orbital $k_2$, when all other particles may occupy
arbitrary spin-orbitals; etc.

However, even the nondiagonal elements of the
density matrices in k-space may have a physical
meaning. Taking over a terminology from quantum
chemistry developed by Coulson and Longuet-Higgins,¹²
we will call $\gamma(k|k)$ the charge order of the spin-orbital
$k$ and the coefficient $\gamma(l|k)$ for $k \neq l$ the bond order of
the two spin-orbitals $k$ and $l$, hence associating the
product of two spin-orbitals in (55) with a "bond"
between them. The first-order density matrix $\gamma(l|k)$ in
k-space is therefore also called the charge- and bond-order
matrix. Similar concepts may be introduced also
for the higher-order densities in k-space, and we note
that the second-order density $\Gamma(l_1l_2|k_1k_2)$ correlates the
charge and bond orders of two particles and at most
four spin-orbitals.


11 P. O. Löwdin, Arkiv mat. astron. fysik A35, No. 9 (1947); "A
Theoretical Investigation into Some Properties of Ionic Crystals"
(thesis) (Almqvist & Wiksells, Uppsala, 1948); J. Chem. Phys.
18, 365 (1950).

Due to the antisymmetry of each set of indices in the 
density matrices, we can conclude that, if two indices 
in the same set are equal, then the corresponding ele- 
ments vanish. For the diagonal elements, we obtain 
in particular 

$$\Gamma(k_1,k_1) = 0, \qquad \Gamma^{(3)}(k_1,k_2,k_2) = 0, \quad\dots, \tag{61}$$
showing that the probability for two particles to be in 
the same spin-orbital vanishes identically. This con- 
sequence of the antisymmetry requirement is an ex- 
pression for Pauli’s exclusion principle in k-space. 

Let us now discuss the properties of the diagonal
elements (60) in greater detail. Since our basic set
$\psi_k(x)$ of one-particle functions is assumed to be
orthonormalized ($d_{kl} = \delta_{kl}$), the only nonvanishing elements
in the basic determinant $D_{KL}$, defined by (39), appear
for pairs $(k,l)$ referring to the same spin-orbital occurring
in both ordered configurations $K$ and $L$ ($d_{kl} = 1$). Due
to the ordering of the indices, these elements 1 may
occur anywhere in the determinant, but, by interchanging
rows and columns in a suitable way, they may
be brought to the diagonal, which procedure changes
the value of the original determinant and its minors
only by a sign factor + or −. If $K$ and $L$ are different
ordered configurations, the diagonal contains also one
or more elements which are zero, and in general, we
therefore obtain the relation


$$D_{KL}(k_1k_2\cdots k_p|k_1k_2\cdots k_p) = \delta_{KL}. \tag{62}$$


Substituting this expression into (56) and (58), we get
finally for the diagonal elements (60):


$$\gamma(k) = \sum_K^{(k)}|C_K|^2\Big/\sum_K|C_K|^2, \qquad \Gamma^{(p)}(k_1,k_2,\dots,k_p) = (p!)^{-1}\sum_K^{(k_1k_2\cdots k_p)}|C_K|^2\Big/\sum_K|C_K|^2, \tag{63}$$


in agreement with the interpretation of $C$ as a "wave
function" given before. However, we note that, since


12 C. A. Coulson and H. C. Longuet-Higgins, Proc. Roy. Soc.
(London) A191, 39; 192, 16 (1947); 193, 447, 456 (1948); 195, 188
(1948).

PER-OLOV LOWDIN 


the quantities $|C_K|^2$ are all positive definite, (63) and
(64) lead to the inequalities


$$0 \le \gamma(k) \le 1, \qquad 0 \le \Gamma^{(p)}(k_1,k_2,\dots,k_p) \le 1, \tag{64}$$


showing that the charge order of a specific spin-orbital $k$
lies always between 0 and 1, and that it can assume the
value 1 only if the spin-orbital $k$ occurs in all ordered
configurations $K$ which are necessary in (54) for describing
the total wave function characteristic for the
physical situation under consideration. Similarly, the
"combined charge order" for the group $(k_1,k_2,\dots,k_p)$
lies always between 0 and 1, and it can assume the
value 1 only if the group $(k_1,k_2,\dots,k_p)$ occurs in all
ordered configurations $K$ necessary for describing the
situation.
The charge order y(k) may be interpreted as the 
average number of particles in the spin-orbital & in the 
physical situation under consideration; see also (59). 
Since the inequalities (64) are essentially depending on 
the antisymmetry requirement (1), this condition has 
here deeper consequences than the Pauli principle in 
its “naive” formulation, which considers only the occu- 
pation numbers 0 or 1. This problem will be further 
discussed in a following section. 
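
The first relation in (63) and the bounds (64) on the charge orders lend themselves to a direct numerical illustration. The following is a minimal sketch with made-up data: the expansion coefficients $C_K$ are random complex numbers attached to the ordered configurations of $N = 3$ particles in 5 spin-orbitals, and the function name is hypothetical.

```python
import itertools

import numpy as np


def charge_orders(coefficients, n_orbitals):
    """gamma(k): sum of |C_K|^2 over configurations K containing k, cf. Eq. (63).

    `coefficients` maps an ordered configuration (a tuple of increasing
    spin-orbital indices) to its coefficient C_K.
    """
    norm = sum(abs(c) ** 2 for c in coefficients.values())
    gamma = np.zeros(n_orbitals)
    for configuration, c in coefficients.items():
        for k in configuration:
            gamma[k] += abs(c) ** 2 / norm
    return gamma


rng = np.random.default_rng(0)
configurations = list(itertools.combinations(range(5), 3))
coefficients = {K: rng.normal() + 1j * rng.normal() for K in configurations}

gamma = charge_orders(coefficients, n_orbitals=5)
print("charge orders:", gamma)
print("sum of charge orders:", gamma.sum())          # equals N by Eq. (59)
print("all within [0, 1]:", bool(np.all((gamma >= 0) & (gamma <= 1))))

# A single Slater determinant gives charge orders of exactly 0 or 1.
print(charge_orders({(0, 2, 3): 1.0}, n_orbitals=5))
```

A charge order reaches 1 only when the spin-orbital appears in every configuration carrying a nonzero coefficient, as stated in the text.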


3. METHOD OF CONFIGURATIONAL INTERACTION 


In quantum mechanics we are particularly interested
in finding the eigenvalues of the Hermitean operators
$\Omega_{op}$ corresponding to physical quantities, i.e., in solving
the equation


$$\Omega_{op}\Psi = W\Psi. \tag{65}$$


In order to discuss this problem, we will assume that
the eigenfunction $\Psi$ exists and is normalizable. We will
further introduce a complete orthonormal basic set of
one-particle functions or spin-orbitals $\psi_k$ ($k = 1, 2, \dots$).
According to (30), the solution may now be expanded
in a series of Slater determinants $\Psi_K$ over all ordered
configurations $K = (k_1,k_2,\dots,k_N)$ with $k_1 < k_2 < \cdots < k_N$:


$$\Psi = \sum_K\Psi_K\,C_K, \qquad \Psi_K = (N!)^{-1/2}\,\det\{\psi_{k_1},\psi_{k_2},\dots,\psi_{k_N}\}. \tag{66}$$


According to (28), the coefficients $C_K$ are the independent
elements of an antisymmetric wave function
$C(k_1,k_2,\dots,k_N)$ in k-space.

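Every normalizable wave function $\Psi$ may be expanded
in the same way, and, for the average value of
$\Omega_{op}$ with respect to such a wave function, we find
therefore


$$\langle\Omega_{op}\rangle_{Av} = \int\Psi^*\,\Omega_{op}\,\Psi\,(dx)\Big/\int\Psi^*\Psi\,(dx) = \sum_{KL}C_K^*\,(K|\Omega_{op}|L)\,C_L\Big/\sum_{KL}C_K^*\,\delta_{KL}\,C_L, \tag{67}$$


where, according to (2) and (49), we have


$$\begin{aligned}
(K|\Omega_{op}|L) &= \int\Psi_K^*\,\Omega_{op}\,\Psi_L\,(dx)\\
&= \Omega_{(0)}D_{KL} + \sum_k^{K}\sum_l^{L}\{k|\Omega_1|l\}\,D_{KL}(k|l)\\
&\quad + (2!)^{-1}\sum_{k_1k_2}^{K}\sum_{l_1l_2}^{L}\{k_1k_2|\Omega_{12}|l_1l_2\}\,D_{KL}(k_1k_2|l_1l_2)\\
&\quad + (3!)^{-1}\sum_{k_1k_2k_3}^{K}\sum_{l_1l_2l_3}^{L}\{k_1k_2k_3|\Omega_{123}|l_1l_2l_3\}\,D_{KL}(k_1k_2k_3|l_1l_2l_3) + \cdots,
\end{aligned}\tag{68}$$


with the matrix notations


$$\begin{aligned}
\{k|\Omega_1|l\} &= \int\psi_k^*(x_1)\,\Omega_1\,\psi_l(x_1)\,dx_1,\\
\{k_1k_2|\Omega_{12}|l_1l_2\} &= \int\psi_{k_1}^*(x_1)\,\psi_{k_2}^*(x_2)\,\Omega_{12}\,\psi_{l_1}(x_1)\,\psi_{l_2}(x_2)\,dx_1\,dx_2, \quad\dots.
\end{aligned}\tag{69}$$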

In order to determine the coefficients $C_K$, i.e., the
wave function in k-space, we will now apply the variation
principle (22) to expression (68). This leads to a
system of linear equations


$$\sum_L\{(K|\Omega_{op}|L) - W\delta_{KL}\}\,C_L = 0. \tag{70}$$


The condition for solubility is given by the secular
equation


$$\det\{(K|\Omega_{op}|L) - W\delta_{KL}\} = 0, \tag{71}$$


which determines the eigenvalues $W$. The values of $C_K$
may then be determined from the system (70), which
may be considered as the representation of the eigenvalue
problem (65) in k-space.

The many-body problem (65) is in this way reduced
to a form which is essentially the same as in the
one-particle problem; the Eqs. (70) and (71) are in both
cases infinite. The method of "configurational interaction"
is therefore in principle simple, but the analytical
or numerical work necessary for evaluating the
matrix elements (68) and for solving the Eqs. (70) and
(71) is certainly still formidable. However, during the
last few years, the work in several research groups has
shown that it is practically possible to tackle the
numerical problem of solving secular equations (70)
of comparatively high orders by means of the modern
electronic computers, and one can expect a steady
development of the methods of programming, etc. In
this connection there are also two principal problems
which have been put in the foreground, namely, firstly,
how to determine the basic set of one-particle functions
in such a way that the series (66) obtains as rapid con- 
vergence as possible, and, secondly, how to get simple 
physical interpretations of the complicated total wave 
functions derived in this way. In the next section, we 
will show that the theory of density matrices is useful 
for treating both these problems. 
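
As a concrete illustration of (68), (70), and (71), consider an operator containing only one-particle terms, $\Omega_{op} = \sum_i\Omega_i$, in a finite orthonormal basis. The sketch below is a toy model, not a general configuration-interaction program: the one-particle matrix `h` (standing for $\{k|\Omega_1|l\}$) is random and real symmetric, the first minors $D_{KL}(k|l)$ are evaluated as signed cofactors of the overlap determinant, and all names are hypothetical. For such an operator the eigenvalues $W$ of the secular problem must be the sums of $N$ distinct one-particle eigenvalues, which gives an independent check.

```python
import itertools

import numpy as np


def first_minor(overlap, i, j):
    """Signed first minor D_KL(k|l): cofactor of element (i, j) of the overlap."""
    reduced = np.delete(np.delete(overlap, i, axis=0), j, axis=1)
    return (-1) ** (i + j) * np.linalg.det(reduced)


def one_body_ci_matrix(h, n_particles):
    """Matrix (K|Omega_op|L) of Eq. (68), keeping only the one-particle term."""
    n_orbitals = h.shape[0]
    configs = list(itertools.combinations(range(n_orbitals), n_particles))
    matrix = np.zeros((len(configs), len(configs)))
    for a, K in enumerate(configs):
        for b, L in enumerate(configs):
            # Orthonormal basis: d(kl) = 1 if k and l are the same orbital.
            overlap = np.array([[1.0 if k == l else 0.0 for l in L] for k in K])
            matrix[a, b] = sum(
                h[k, l] * first_minor(overlap, i, j)
                for i, k in enumerate(K)
                for j, l in enumerate(L)
            )
    return configs, matrix


rng = np.random.default_rng(0)
raw = rng.normal(size=(4, 4))
h = (raw + raw.T) / 2  # hypothetical one-particle matrix {k|Omega_1|l}

configs, ci_matrix = one_body_ci_matrix(h, n_particles=2)

# Solving (70)-(71) amounts to diagonalizing the Hermitean CI matrix.
W, C = np.linalg.eigh(ci_matrix)

# Independent check: sums of two distinct one-particle eigenvalues.
eps = np.linalg.eigvalsh(h)
expected = sorted(eps[i] + eps[j] for i, j in itertools.combinations(range(4), 2))
print(np.allclose(W, expected))
```

With two- and many-particle terms present the same diagonalization applies; only the evaluation of the matrix elements (68) becomes more laborious.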


4. NATURAL SPIN-ORBITALS AND THE CONVERGENCE 
PROBLEM IN THE METHOD OF 
CONFIGURATIONAL INTERACTION


The many-particle problems, which have been solved 
with the greatest accuracy up till now, are connected 
with the theory of electronic structure of atoms, mole- 
cules, and crystals. In their treatments of atoms and 
molecules, Boys and Meckler and others have used 
the method of configurational interaction in an approxi- 
mate form, and they have overcome the numerical 
difficulties by aid of electronic computers. However, 
their preliminary results are then rather complicated 
wave functions in configuration space, and one is still 
looking for simple physical interpretations. In this con- 
nection, we would like to point out the importance of 
the first, second, and higher order density matrices (3). 

Let us start by considering only the first-order density matrix $\gamma(x_1'|x_1)$, derived from the wave function according to (3) or (55). The corresponding matrix $\gamma(l|k)$ in the k-space, i.e., the charge- and bond-order matrix, is Hermitean, and it is therefore possible to find a unitary matrix $U$ which transforms this matrix to diagonal form with the eigenvalues $n_k = n_{kk}$:

$$U^{\dagger}\gamma U = n = \text{diagonal matrix}. \tag{72}$$

We have further, in matrix form, $\gamma = U n U^{\dagger}$, and, if we introduce a new set of spin-orbitals $\chi_k$ by the matrix relation $\chi = \psi U$, or

$$\chi_k = \sum_{\alpha} \psi_{\alpha} U_{\alpha k}, \tag{73}$$

we may rewrite the density matrix in the form

$$\gamma(x_1'|x_1) = \sum_k n_k\, \chi_k^*(x_1')\, \chi_k(x_1). \tag{74}$$
This form is characterized by the fact that all bond orders are vanishing, and the new spin-orbitals $\chi_k$ will therefore be called the natural spin-orbitals associated with the system and state under consideration. The corresponding charge orders $n_k$, which are the eigenvalues of the matrix $\gamma(l|k)$, will be interpreted as their occupation numbers, since they represent the average number of particles in each one of the natural spin-orbitals. We note that, if two or more charge orders are the same for spin-orbitals of the same spin type, the corresponding orbitals form a degenerate group, and $\gamma(x_1'|x_1)$ is then invariant against unitary transformations of the orbitals within such a group.


13 S. F. Boys, Proc. Roy. Soc. (London) A200, 542 (1950); 201, 125 (1950); 206, 489 (1951); 207, 181, 197 (1951); etc.
14 A. Meckler, J. Chem. Phys. 21, 1750 (1953).
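
The diagonalization (72) can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical real symmetric 2×2 charge- and bond-order matrix; the numerical values and the helper names `natural_orbitals_2x2` and `transform` are illustrative and are not taken from the text. It uses the closed-form rotation that diagonalizes a 2×2 symmetric matrix to obtain the occupation numbers $n_k$ and the matrix $U$, and then checks that the bond order vanishes in the natural basis.

```python
import math

def natural_orbitals_2x2(gamma):
    """Diagonalize a real symmetric 2x2 matrix gamma = [[a, b], [b, d]].

    Returns (n, U): occupation numbers n = [n_1, n_2] and an orthogonal
    matrix U whose columns hold the expansion coefficients of the natural
    spin-orbitals, so that U^T gamma U = diag(n), cf. Eq. (72).
    """
    (a, b), (_, d) = gamma
    # Rotation angle that removes the off-diagonal (bond-order) element.
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    U = [[c, -s], [s, c]]
    n1 = a * c * c + 2.0 * b * c * s + d * s * s
    n2 = a * s * s - 2.0 * b * c * s + d * c * c
    return [n1, n2], U

def transform(U, gamma):
    """Return U^T gamma U for 2x2 real matrices."""
    size = range(2)
    return [[sum(U[i][p] * gamma[i][j] * U[j][q] for i in size for j in size)
             for q in size] for p in size]

# Hypothetical charge- and bond-order matrix (illustrative values only).
gamma = [[0.612, 0.384], [0.384, 0.388]]
n, U = natural_orbitals_2x2(gamma)
diag = transform(U, gamma)
print("occupation numbers:", n)
print("bond order in the natural basis:", diag[0][1])
```

The trace is preserved by the transformation, so the occupation numbers add up to the same total as the original charge orders.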
According to (64) and (59), the occupation numbers fulfill the two conditions

$$0 \le n_k \le 1, \qquad \sum_k n_k = N, \tag{75}$$

and we can therefore conclude that the particles must be distributed over more than $N$ natural spin-orbitals with a limiting case, when they are occupying exactly $N$ spin-orbitals. The condition for the limiting case may be expressed in the form

$$\gamma^2 = \gamma, \qquad \mathrm{Tr}(\gamma) = N, \tag{76}$$

where Tr(=trace) means the formation of the diagonal sum, for $\gamma(l|k)$ has then exactly $N$ eigenvalues equal to 1 and the remaining zero. If, in such a case, we would choose the natural spin-orbitals as our basic set, all configurations in expansion (66) must contain the fully occupied spin-orbitals, i.e., this expansion is reduced to a single Slater determinant. This would mean that, provided the necessary existence and convergence theorems for the solution $\Psi$ are fulfilled, the relation $\gamma^2 = \gamma$ in k- or x-space would form a sufficient condition for the possibility of reducing the total wave function to a single determinant, i.e., for the strict validity of the Hartree-Fock approximation. Our conclusion, which is based on Eqs. (63) and (64), is the reverse to a theorem previously shown by Dirac.¹

It is well known that, in a system where the particles show mutual interaction, the Hartree-Fock approximation is usually not strictly valid, and this means that, by the effect of this interaction, the occupation numbers are depressed below 1: $0 \le n_k < 1$. The corresponding Cayley-Hamilton equation for the matrix $\gamma$ is then more complicated than the first relation (76).

We note that the antisymmetry requirement (1), which leads to the first condition (75), is here more general than Pauli’s exclusion principle in its original form, which considers only the occupation numbers 0 or 1 and therefore explicitly must refer to the Hartree-Fock approximation. We note that part of the importance of the Hartree-Fock scheme depends on its physical simplicity and visuality connected with the fact that some changes of the system, as ionization¹⁵ and excitation, may be described as resulting from entire particles jumping from occupied to unoccupied spin-orbitals or to infinity. In this scheme, the natural spin-orbitals are identical with the ordinary Hartree-Fock functions, being determined only up to unitary transformations of the two groups of orbitals associated with different spin types. Already at this stage, the numerical computations involved are extremely laborious, but, by aid of the modern electronic computers, it seems now possible to reach beyond this approximation. In a more exact theory, the circumstances are certainly more complicated,¹⁶ since the occupation numbers may lie between 0 and 1, and ionizations and excitations of the system are then accompanied by changes of the numbers $n_k$ by fractions of 1 with possible changes also of the nondiagonal elements $n_{kl}$. However, in a following paper, we will show that it is possible to preserve some of the simplicity and visuality of the Hartree-Fock scheme even in more exact treatments using the method of configurational interaction.


15 T. Koopmans, Physica 1, 104 (1933).

16 The same complications will also occur, for instance, in an exact electron-positron theory, which is based on Dirac’s original idea of a fully-occupied vacuum.
Let us now turn to the convergence problem connected with the expansion (66) after ordered configurations $K$. It could happen that the arbitrarily chosen basic set $\psi_k$ is inconvenient for its purpose, and the convergence of (66) is then correspondingly slow. In order to investigate the effect of introducing natural spin-orbitals $\chi_k$, we will now carry out the matrix transformation $\psi = \chi U^{\dagger}$, or

$$\psi_k = \sum_{\alpha} \chi_{\alpha} U^{\dagger}_{\alpha k}. \tag{77}$$

By using a theorem’ for expanding a determinant of a matrix being a product of two rectangular matrices, we obtain the following transformation for the basic Slater determinants $\Psi_K$ and $X_L$:

$$\Psi_K = \sum_L X_L A_{LK}, \qquad X_L = (N!)^{-1/2} \det\{\chi_{l_1}, \chi_{l_2}, \cdots \chi_{l_N}\}, \tag{78}$$

where

$$A_{LK} = \begin{vmatrix} U^{\dagger}(l_1 k_1) & \cdots & U^{\dagger}(l_1 k_N) \\ \vdots & & \vdots \\ U^{\dagger}(l_N k_1) & \cdots & U^{\dagger}(l_N k_N) \end{vmatrix}. \tag{79}$$

By putting this formula into (66), the total wave function may instead be expanded in determinants $X_L$ over all ordered configurations $L$ of the natural spin-orbitals $\chi_k$:

$$\Psi = \sum_L X_L \Bigl(\sum_K A_{LK} C_K\Bigr). \tag{80}$$

In contrast to (66), we could call (80) the natural expansion of the total wave function.

Its convergence properties may now be understood from the relations (63), (64), and (74). In the limiting case, when exactly $N$ natural spin-orbitals are fully occupied and the relation $\gamma^2 = \gamma$ is fulfilled, the natural expansion (80) is reduced to a single Slater determinant. In considering the convergence, this is of course the most favorable case. However, if only a finite number of the occupation numbers $n_k$ in (74) are essentially different from zero, the natural expansion (80) will be reduced to a sum of determinants over all ordered configurations associated with these essentially occurring spin-orbitals, i.e., to a sum of comparatively few terms. The introduction of natural spin-orbitals seems therefore to provide a simple solution of the convergence problem, previously discussed by Slater.¹⁷


17 J. C. Slater, Quarterly Progress Report of Solid-State and Molecular Theory Group at M.I.T., 6, January 15, 1953 (unpublished); Technical Report No. 3, 39, February 15, 1953 (unpublished); Phys. Rev. 91, 528 (1953).
Note added in proof. It is desirable to have also a more exact mathematical measure for the rapidity of convergence of the two configurational interaction series (66) and (80). We note that, according to (60) and (63), the charge order $\gamma(k)$ gives the probability for the ordinary spin-orbital $\psi_k$ to occur in the expansion of the total wave function $\Psi$. If only $M$ of the numbers $\gamma(k)$, $k = 1, 2, 3, \cdots$, are essentially different from zero, then the number of essential terms in (66) is given by the corresponding number of possible configurations: $M!/N!(M-N)!$. In using this procedure, however, it is necessary to evaluate the individual quantities $\gamma(k)$ and to distinguish between essential and unessential charge orders.

A still simpler measure of convergence may be constructed by observing that the charge orders always lie between 0 and 1 and that, in the limiting cases $\gamma(k) = 0$ and $\gamma(k) = 1$, the corresponding spin-orbital $\psi_k$ occurs in none or in all of the terms in (66), respectively, without contributing to the slowing down of the convergence of the series. The eventual slowness of the convergence of (66) depends instead on the possibility for an electron to be distributed over two or more spin-orbitals, giving charge orders of an intermediate order of magnitude, $0 < \gamma(k) < 1$. The rapidity of convergence of (66) may therefore be measured by the smallness of the quantity

$$\delta = (1/N) \sum_k \{1 - \gamma(k)\}\gamma(k) = 1 - (1/N) \sum_k \gamma(k)^2,$$

which fulfills the inequality $0 \le \delta < 1$. In considering different basic sets $\psi_1, \psi_2, \psi_3, \cdots$ for the description of the same total wave function $\Psi$, it is clear that the natural spin-orbitals $\chi_k$ are characterized by having the smallest $\delta$ value possible. According to (72), we have $\gamma = U n U^{\dagger}$ and $\gamma^2 = U n^2 U^{\dagger}$, leading to $\mathrm{Tr}(\gamma^2) = \mathrm{Tr}(n^2)$ and

$$\sum_k \gamma(k)^2 = \sum_k n_k^2 - \sum_{k \neq l} |\gamma(l|k)|^2 \le \sum_k n_k^2,$$

with the final result

$$1 - (1/N) \sum_k n_k^2 \le 1 - (1/N) \sum_k \gamma(k)^2,$$

which proves our theorem. This means that the natural spin-orbitals are distinguished not only by having vanishing bond orders but also by giving the smallest number of essential charge orders possible. By investigating the quantity $\delta$, one can therefore easily estimate how much improvement one can expect in the convergence of a given configurational interaction series by introducing the natural spin-orbitals.

The quantity $\delta_{\mathrm{nat}}$ for the natural spin-orbitals themselves may also be expressed in the form

$$\delta_{\mathrm{nat}} = (1/N)\, \mathrm{Tr}(\gamma - \gamma^2),$$

and we note that, provided the necessary existence and convergence conditions are fulfilled, the relation

$$\delta_{\mathrm{nat}} = 0$$

is a necessary and sufficient condition for expressing an arbitrary antisymmetric wave function in the form of a single determinant. The necessity follows from Dirac’s theorem (see reference 1) $\rho^2 = \rho$, and, in order to prove the sufficiency, we note that from $\delta_{\mathrm{nat}} = 0$ it follows that $\sum_k n_k(1 - n_k) = 0$ with $0 \le n_k \le 1$, and that, since this sum does not contain any negative terms, this relation can be fulfilled only if $n_k = 0$ or 1. Combined with the normalization condition $\mathrm{Tr}(\gamma) = N$, this means that exactly $N$ natural spin-orbitals are fully occupied each by one electron, and, according to the first relation (63), the antisymmetric wave function must then be expressible as a single determinant built up from these spin-orbitals. The deviation from zero of the single number $\delta_{\mathrm{nat}}$ tells us also how far our wave function $\Psi$ is from the Hartree-Fock approximation. (Received January 24, 1955.)
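
The convergence measure of the note can be evaluated without any diagonalization, since $\delta$ needs only the diagonal charge orders and $\delta_{\mathrm{nat}}$ needs only traces. The following is a minimal sketch under stated assumptions: the 3×3 matrix `gamma` (with $N = 2$) is a hypothetical example chosen for illustration, and the function names are not from the text.

```python
import math

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def matmul(a, b):
    size = range(len(a))
    return [[sum(a[i][k] * b[k][j] for k in size) for j in size] for i in size]

def delta_basis(gamma, n_particles):
    """delta = (1/N) sum_k {1 - gamma(k)} gamma(k), from the diagonal charge orders."""
    return sum((1.0 - gamma[k][k]) * gamma[k][k]
               for k in range(len(gamma))) / n_particles

def delta_natural(gamma, n_particles):
    """delta_nat = (1/N) Tr(gamma - gamma^2); basis independent."""
    return (trace(gamma) - trace(matmul(gamma, gamma))) / n_particles

def number_of_configurations(m_orbitals, n_particles):
    """Number of configurations M! / (N! (M - N)!)."""
    return math.comb(m_orbitals, n_particles)

# Hypothetical charge- and bond-order matrix with Tr(gamma) = N = 2.
gamma = [[1.0, 0.0, 0.0],
         [0.0, 0.612, 0.384],
         [0.0, 0.384, 0.388]]
N = 2
d_basis = delta_basis(gamma, N)
d_nat = delta_natural(gamma, N)
print("delta in the given basis:", d_basis)
print("delta for the natural spin-orbitals:", d_nat)
print("configurations for M = 3, N = 2:", number_of_configurations(3, N))
```

In this example the nonvanishing bond order 0.384 is what makes `d_basis` exceed `d_nat`, in line with the inequality proved in the note.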


By the transformation (77), even the higher-order densities may now be expressed in the natural spin-orbitals, but we note that, unlike the first-order density, they are usually not brought to diagonal form. As an example, we may consider the second-order density matrix in the limiting case, when exactly $N$ spin-orbitals are fully occupied, i.e., the relation (76) is fulfilled. According to (74) and (51), we obtain

$$\gamma(x_1'|x_1) = \sum_k \chi_k^*(x_1')\chi_k(x_1),$$

$$\Gamma(x_1'x_2'|x_1x_2) = \tfrac{1}{2}\begin{vmatrix} \gamma(x_1'|x_1) & \gamma(x_1'|x_2) \\ \gamma(x_2'|x_1) & \gamma(x_2'|x_2) \end{vmatrix} = \tfrac{1}{2}\sum_{kl}\{\chi_k^*(x_1')\chi_k(x_1)\chi_l^*(x_2')\chi_l(x_2) - \chi_k^*(x_1')\chi_l(x_1)\chi_l^*(x_2')\chi_k(x_2)\}. \tag{81}$$

In the last term in the expansion for $\Gamma$, there are cross products of the form $\chi_k^*(x_1')\chi_l(x_1)$ which, for $k \neq l$, lead to the existence of the well-known exchange effects associated with each pair of natural spin-orbitals $\chi_k$ and $\chi_l$. Higher-order densities may be treated analogously. The corresponding expansions in the general case (74) are slightly more complicated, but there are no principal difficulties in deriving them.

We note finally that the diagonal elements of the second-order matrix have previously been used successfully by, among others, Lennard-Jones¹⁸ in discussing correlation properties between electrons in atoms and molecules. In case of symmetric wave functions $\Psi$, they have also been used by London¹⁹ for investigating the distance correlation in a Bose-Einstein gas.


5. LIMITED CONFIGURATIONAL INTERACTION. 
EXTENDED HARTREE-FOCK EQUATIONS 


In the last three sections, we have assumed that the basic set $\psi_k$ of one-particle functions is infinite and complete. An arbitrary normalizable function $F(x_1'|x_1)$ may then be expressed by the expansion

$$F(x_1'|x_1) = \sum_{k,l} \psi_k^*(x_1')\psi_l(x_1)F_{lk}. \tag{82}$$

However, it is immediately clear that, in applications to particular problems, we must usually replace this set by a set of finite order $M$. Since the basic set is then no longer complete, we meet the problem how to determine this set in order to obtain a solution (66) to (65), which is as accurate as possible. In the case $M = N$, this leads to the ordinary Hartree-Fock problem, but, if $M > N$, it leads to an extension of this scheme recently proposed by Slater.²⁰

Let us assume that $M > N$ and that our basic set $\psi_k$ ($k = 1, 2, \cdots M$) is orthonormal,

$$\int \psi_k^*\psi_l\,dx = \delta_{kl}, \tag{83}$$

which imposes an orthogonality condition on the orbitals belonging to the same spin type. We will further characterize our basic set by a projection matrix $\rho$ defined by

$$\rho(x_1, x_2) = \sum_{k=1}^{M} \psi_k^*(x_1)\psi_k(x_2). \tag{84}$$

We observe that, since $\rho$ fulfills the relations

$$\int \rho(x_1,\xi_1)\rho(\xi_1,x_2)\,d\xi_1 = \rho(x_1,x_2), \qquad \int \rho(x_1,x_1)\,dx_1 = M,$$

or

$$\rho^2 = \rho, \qquad \mathrm{Tr}(\rho) = M, \tag{85}$$

it has really the character of a projection operator.²¹ In the case $M = N$, it is identical with Dirac’s density matrix,¹ but it must not be confused with this matrix for $M > N$. It is now no longer possible to obtain an exact expansion (82) of an arbitrary function $F(x_1'|x_1)$; we have to be satisfied with the approximate form

$$f(x_1'|x_1) = \sum_{k,l=1}^{M} \psi_k^*(x_1')\psi_l(x_1)F_{lk}. \tag{86}$$

The function $f(x_1'|x_1)$, defined by this interrupted expansion, is said to represent the orthogonal projection of the function $F(x_1'|x_1)$ on the subspace of the general Hilbert space, defined by the basic set $\psi_k$ ($k = 1, 2, \cdots M$). We note the validity of the matrix relation

$$f = \rho F \rho, \tag{87}$$

which shows the use of the projection operator $\rho$. For every function $f(x_1'|x_1)$ which is expansible in the basic set $\psi_k$, i.e., which belongs to the subspace defined by this set, we have further

$$f = \rho f = f\rho = \rho f \rho. \tag{88}$$


18 J. Lennard-Jones, J. Chem. Phys. 20, 1024 (1952); J. Lennard-Jones and J. A. Pople, Phil. Mag. 43, 581 (1952).

19 F. London, J. Chem. Phys. 11, 203 (1943).

20 J. C. Slater, see reference 17. Compare also J. Frenkel, Wave Mechanics, Advanced General Theory (Clarendon Press, Oxford, 1934), pp. 460-462, who has treated the same problem by the method of second quantization.
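
The projection relations (85), (87), and (88) can be checked in a finite-dimensional stand-in for the function space. The following is a minimal sketch, assuming real vectors in a three-dimensional space in place of the spin-orbitals and a basic set of order $M = 2$; the vectors, the matrix `F`, and the helper names are hypothetical choices for illustration.

```python
def projector(vectors):
    """rho = sum_k psi_k psi_k^T for real orthonormal vectors, cf. Eq. (84)."""
    dim = len(vectors[0])
    return [[sum(v[i] * v[j] for v in vectors) for j in range(dim)]
            for i in range(dim)]

def matmul(a, b):
    size = range(len(a))
    return [[sum(a[i][k] * b[k][j] for k in size) for j in size] for i in size]

def close(a, b, tol=1e-9):
    """Element-wise comparison of two square matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a)))

# Hypothetical orthonormal basic set of order M = 2 in a 3-dimensional space.
basis = [[1.0, 0.0, 0.0], [0.0, 0.6, 0.8]]
rho = projector(basis)
trace_rho = sum(rho[i][i] for i in range(3))

# Arbitrary symmetric matrix F and its projection f = rho F rho, cf. Eq. (87).
F = [[2.0, 1.0, 0.5], [1.0, 3.0, -1.0], [0.5, -1.0, 4.0]]
f = matmul(matmul(rho, F), rho)

print("rho^2 = rho:", close(matmul(rho, rho), rho))
print("Tr(rho):", trace_rho)
print("rho f = f and f rho = f:", close(matmul(rho, f), f), close(matmul(f, rho), f))
```

Because `f` already lies in the subspace, projecting it again leaves it unchanged, which is the content of (88).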


Let us now again study the eigenvalue problem (65). In expansion (66) of the solution $\Psi$, both the coefficients $C_K$ and the basic spin-orbitals $\psi_k$ ($k = 1, 2, \cdots M$) are undetermined, and, in order to derive the best approximation of the solution, we will apply the variation principle (22). According to (10), (57), and (58), we have

$$(\Omega_{\mathrm{op}})_{\mathrm{Av}} = \Omega_0 + \sum_{kl}\{k|\Omega_1|l\}\gamma(l|k) + \sum_{\substack{k_1k_2 \\ l_1l_2}}\{k_1k_2|\Omega_{12}|l_1l_2\}\Gamma(l_1l_2|k_1k_2) + \sum_{\substack{k_1k_2k_3 \\ l_1l_2l_3}}\{k_1k_2k_3|\Omega_{123}|l_1l_2l_3\}\Gamma^{(3)}(l_1l_2l_3|k_1k_2k_3) + \cdots, \tag{89}$$

where the matrix elements are defined by (69). Variation of the coefficients $C_K$ leads to Eqs. (70) and (71), and variation of the basic set $\psi_k$ leads to extended Hartree-Fock equations of the form

$$\begin{aligned} &\sum_l \Omega_1\psi_l(x_1)\gamma(l|k) + 2\sum_{l_1l_2k_2}\int\psi_{k_2}^*(2)\,\Omega_{12}\,\psi_{l_2}(2)\,dx_2\,\psi_{l_1}(x_1)\,\Gamma(l_1l_2|kk_2) \\ &\quad + 3\sum_{\substack{l_1l_2l_3 \\ k_2k_3}}\int\psi_{k_2}^*(2)\psi_{k_3}^*(3)\,\Omega_{123}\,\psi_{l_2}(2)\psi_{l_3}(3)\,dx_2dx_3\,\psi_{l_1}(x_1)\,\Gamma^{(3)}(l_1l_2l_3|kk_2k_3) + \cdots = \sum_l \psi_l(x_1)\lambda(l|k). \end{aligned} \tag{90}$$

The quantities $\lambda(l|k)$ are here the Lagrangian multipliers associated with the orthonormality condition (83). Since spin-orbitals with different spins are automatically orthogonal, the multipliers $\lambda(l|k)$ may be different from zero only for spin-orbitals $k$ and $l$ associated with the same spin type. Since further the quantity

$$\sum_{kl}\lambda(l|k)\int\psi_k^*\psi_l\,dx \tag{91}$$

must be real, we can conclude that the multipliers $\lambda(l|k)$ form an Hermitean matrix: $\lambda(k|l) = \lambda^*(l|k)$.


21 J. v. Neumann, Math. Grundlagen der Quantenmechanik (Dover Publications, New York, 1943), p. 41.

By multiplying Eqs. (90) for each $k$ by $\psi_k^*(\xi_1')$ and by summing $k$ from 1 to $M$, we may express the extended Hartree-Fock equations in the more condensed form

$$\Omega_1\gamma(\xi_1'|x_1) + 2\int\Omega_{12}\Gamma(\xi_1'x_2'|x_1x_2)\,dx_2 + 3\int\Omega_{123}\Gamma^{(3)}(\xi_1'x_2'x_3'|x_1x_2x_3)\,dx_2dx_3 + \cdots = \lambda(\xi_1'|x_1), \tag{92}$$

where we have assumed that the operators $\Omega_1$, $\Omega_{12}$, $\Omega_{123}$, $\cdots$ etc. do not work on the variables $\xi_1'$, $x_2'$, $x_3'\cdots$. After the operations in the integrands have been carried out, we shall as before put all $x_i' = x_i$, whereas $\xi_1'$ may have an arbitrary value. The function $\lambda(\xi_1'|x_1)$ in the right-hand member is here given by the Lagrangian multipliers:

$$\lambda(\xi_1'|x_1) = \sum_{kl}\psi_k^*(\xi_1')\psi_l(x_1)\lambda(l|k). \tag{93}$$

We note that, in Eq. (92), the left-hand member is entirely independent of $M$, and this form is therefore convenient for discussing the transition from $M = N$ to $M = \infty$. The function (93) may be considered as the “projection” of an arbitrary Hermitean function $\Lambda(\xi_1'|x_1)$ on the subspace defined by the basic set of order $M$:

$$\lambda = \rho\Lambda\rho. \tag{94}$$

It is apparently this relation which gives the essential condition for determining the best set of a finite order.
However, in the limiting case when $M \to \infty$ and the set tends to be complete, we have

$$\lim_{M\to\infty}\rho(x_1,x_2) = \delta(x_1 - x_2), \tag{95}$$

and relation (94) is then changed into the identity $\Lambda = \Lambda$. This means that, in the limiting case, the extended Hartree-Fock equations (92) lose their meaning as a restraining condition on the basic set $\psi_k$, which may then be chosen arbitrarily, in agreement with our previous assumptions.

Let us now turn back to the case of a finite order $M$. The quantities $\gamma(\xi_1'|x_1)$ and $\lambda(\xi_1'|x_1)$ in (92) are quadratic forms with Hermitean coefficients, and the question is whether we can bring them to diagonal forms. In the case when $M = N$, the first-order density $\gamma(\xi_1'|x_1)$ is from the very beginning in diagonal form, and it is then possible to determine a unitary transformation of the basic set which brings also the matrix $\lambda(\xi_1'|x_1)$ to diagonal form. This is a conventional procedure in the ordinary Hartree-Fock scheme, and the eigenvalues of the matrix $\lambda(l|k)$ are called the orbital energies of the basic spin-orbitals $\psi_k$. However, if $M > N$, the first order density matrix $\gamma(\xi_1'|x_1)$ may be brought to diagonal form (74) first by introducing the natural spin-orbitals $\chi_k$, and, only if several occupation numbers $n_k$ are the same with a corresponding degeneracy in the spin-orbitals $\chi_k$, we have any additional transformations free for changing the form of $\lambda(\xi_1'|x_1)$, too. In general, we cannot therefore expect that it should always be possible to bring $\gamma(\xi_1'|x_1)$ and $\lambda(\xi_1'|x_1)$ simultaneously to diagonal form.

In order to consider the natural spin-orbitals in greater detail, we will start from (92) and rewrite the extended Hartree-Fock equations (90) in the form

$$\begin{aligned} &\int\psi_k(\xi_1')\,\Omega_1\gamma(\xi_1'|x_1)\,d\xi_1' + 2\int\psi_k(\xi_1')\,\Omega_{12}\Gamma(\xi_1'x_2'|x_1x_2)\,d\xi_1'dx_2 \\ &\quad + 3\int\psi_k(\xi_1')\,\Omega_{123}\Gamma^{(3)}(\xi_1'x_2'x_3'|x_1x_2x_3)\,d\xi_1'dx_2dx_3 + \cdots = \int\psi_k(\xi_1')\lambda(\xi_1'|x_1)\,d\xi_1'. \end{aligned} \tag{96}$$

Carrying out the transformation (77) to natural spin-orbitals $\chi_k$ and dividing by $n_k \neq 0$, we obtain

$$\begin{aligned} &\Omega_1\chi_k(x_1) + 2n_k^{-1}\int\chi_k(\xi_1')\,\Omega_{12}\Gamma(\xi_1'x_2'|x_1x_2)\,d\xi_1'dx_2 \\ &\quad + 3n_k^{-1}\int\chi_k(\xi_1')\,\Omega_{123}\Gamma^{(3)}(\xi_1'x_2'x_3'|x_1x_2x_3)\,d\xi_1'dx_2dx_3 + \cdots = \sum_{l=1}^{M}\chi_l(x_1)\lambda'(l|k)n_k^{-1}, \end{aligned} \tag{97}$$

where $\lambda' = U^{\dagger}\lambda U$. This is the exact integro-differential equation satisfied by the natural spin-orbitals, which were previously shown to lead to the most rapid convergence of the expansion (66).


Connection with Slater’s Extension of the 
Hartree-Fock Equations 


In some recent work, which has appeared only in preprints, Slater¹⁷ has investigated the convergence problem in the method of configurational interaction,
and he has intuitively proposed that the basic set of 
one-particle functions, which would lead to the most 
rapid convergence, should satisfy an extended form of 
the Hartree-Fock equations. Since we have here shown 
that this set satisfies (97), it is of interest to investigate 
the connection with Slater’s equation. 

Equation (97) may also be written in the form

$$\{\Omega_1 + V_{\mathrm{op}}(1)\}\chi_k(1) = \sum_{l=1}^{M}\chi_l(1)\lambda'(l|k)n_k^{-1}, \tag{98}$$

where $V_{\mathrm{op}}$ is a rather complicated operator containing ordinary potentials as well as exchange operators. Since $V_{\mathrm{op}}$ does not commute with the coordinate $x_1$, these two quantities are usually not compatible. However, in order to obtain the connection with Slater’s approach, we will now replace $V_{\mathrm{op}}$ by its “best approximation” in x-space:

$$V_{\mathrm{op}}\chi_k(x_1) \approx V(x_1)\chi_k(x_1), \qquad k = 1, 2, \cdots M \tag{99}$$

which may be defined by the condition that the sum

$$\sum_{k=1}^{M}\kappa_k\,|V_{\mathrm{op}}\chi_k(x_1) - V(x_1)\chi_k(x_1)|^2, \tag{100}$$

should be as small as possible. The quantities $\kappa_k$ are here appropriate weights, and, for the natural spin-orbitals, it seems natural to choose them as being just the occupation numbers: $\kappa_k = n_k$. In this way, using the minimum condition, we obtain

$$V(x_1) = \frac{\sum_k n_k\chi_k^*(x_1)V_{\mathrm{op}}\chi_k(x_1)}{\sum_k n_k\chi_k^*(x_1)\chi_k(x_1)}. \tag{101}$$

According to (74), the quantity in the denominator is just the first-order density $\gamma(x_1|x_1)$. Using (97) and (98), and observing the validity of the relations

$$\int\rho(x_1,\xi_1')\Gamma(\xi_1'x_2'|x_1x_2)\,d\xi_1' = \Gamma(x_1x_2'|x_1x_2), \qquad \int\rho(x_1,\xi_1')\Gamma^{(3)}(\xi_1'x_2'x_3'|x_1x_2x_3)\,d\xi_1' = \Gamma^{(3)}(x_1x_2'x_3'|x_1x_2x_3), \tag{102}$$

we get finally for the “average potential” $V(x_1)$:

$$V(x_1) = 2\int\Omega_{12}\Gamma(x_1'x_2'|x_1x_2)\,dx_2/\gamma(x_1) + 3\int\Omega_{123}\Gamma^{(3)}(x_1'x_2'x_3'|x_1x_2x_3)\,dx_2dx_3/\gamma(x_1) + \cdots. \tag{103}$$
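
The minimum condition that leads from (100) to (101) is an ordinary weighted least-squares problem at each point $x_1$. The following is a minimal sketch of that single step, assuming hypothetical complex values for $\chi_k(x_1)$ and $V_{\mathrm{op}}\chi_k(x_1)$ at one fixed point; the numbers and function names are illustrative only and do not come from the text.

```python
def best_local_value(a, b, weights):
    """Minimize sum_k w_k |a_k - V b_k|^2 over a complex number V.

    Setting the derivative with respect to conj(V) to zero gives
    V = sum_k w_k conj(b_k) a_k / sum_k w_k |b_k|^2,
    which is the pointwise analogue of Eq. (101).
    """
    numerator = sum(w * bk.conjugate() * ak for w, ak, bk in zip(weights, a, b))
    denominator = sum(w * abs(bk) ** 2 for w, bk in zip(weights, b))
    return numerator / denominator

def weighted_error(v, a, b, weights):
    """The weighted sum (100) for a trial value v."""
    return sum(w * abs(ak - v * bk) ** 2 for w, ak, bk in zip(weights, a, b))

# Hypothetical values at one point x_1 (illustrative only):
chi = [0.7 + 0.1j, -0.3 + 0.4j, 0.2 - 0.5j]        # chi_k(x_1)
v_op_chi = [1.1 - 0.2j, 0.4 + 0.9j, -0.6 + 0.3j]   # (V_op chi_k)(x_1)
occupations = [1.0, 0.9, 0.1]                      # weights kappa_k = n_k

v_best = best_local_value(v_op_chi, chi, occupations)
e_best = weighted_error(v_best, v_op_chi, chi, occupations)

# Any displaced trial value should give an error at least as large.
trial_errors = [weighted_error(v_best + d, v_op_chi, chi, occupations)
                for d in (0.05, -0.05, 0.05j, -0.05j)]
print("V at this point:", v_best)
print("minimum error:", e_best, "trial errors:", trial_errors)
```

Spin-orbitals with small occupation numbers enter with small weights, so they influence the average potential only weakly.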


Equation (97) may therefore be replaced by the approximate form

$$\{\Omega_1 + V(x_1)\}\chi_k(x_1) = \sum_{l=1}^{M}\chi_l(x_1)\lambda'(l|k)n_k^{-1}. \tag{104}$$

Since the operator $\{\Omega_1 + V(x_1)\}$ is Hermitean, the same must hold also for the matrix $\lambda'(l|k)n_k^{-1}$ in the right-hand member, which implies that, in this approximation, there can be $\lambda'$-couplings only between natural spin-orbitals having the same occupation number. However, each such group is degenerate and, by carrying out a suitable unitary transformation, we may then also bring the matrix $\lambda'(l|k)n_k^{-1}$ to diagonal form. Instead of the rather complicated Eq. (97), we have now obtained an approximate representation in the form of an eigenvalue problem

$$\{\Omega_1 + V(x_1)\}\chi_k(x_1) = \epsilon_k\chi_k(x_1), \tag{105}$$

where $V(x_1)$ is the “average potential” given by (103).

We are now ready to carry out a comparison with the extended Hartree-Fock equations intuitively proposed by Slater.¹⁷ Since we are then mainly interested in electronic systems, the basic Hamiltonian is of the specific form (11), with $\Omega_{12} = e^2/r_{12}$, $\Omega_{123} = 0$, $\cdots$ etc. According to (103), we get for the “average potential”

$$V(x_1) = 2e^2\int\frac{\Gamma(x_1x_2|x_1x_2)}{r_{12}\,\gamma(x_1)}\,dx_2, \tag{106}$$

which is just the potential discussed by Slater. Our procedure, based on the minimization of (100), gives then a strict derivation of this potential for $M > N$. Hence we obtain also a new derivation of Slater’s average exchange potential²² in the ordinary Hartree-Fock scheme with $M = N$.

As Slater has pointed out, the approximate form (105) is much more convenient to handle numerically than the exact Eqs. (97) containing exchange operators and leading to coupled integro-differential equations of a rather complicated type. Since the approximation also seems to be very good, as shown, e.g., by Pratt²³ for the case of $M = N$, it seems feasible for most applications to use (105) instead of (97). However, for investigating the convergence problem, it is not necessary to solve either (105) or (97), since this problem is now simply treated by the diagonalization (74) of the first-order density matrix, leading automatically to the natural spin-orbitals.


22 J. C. Slater, Phys. Rev. 81, 385 (1951).
23 G. W. Pratt, Jr., Phys. Rev. 88, 1217 (1952).


6. CONCLUSIONS 


In the case in which our basic set of one-particle 
functions is chosen infinite and complete, we have 
shown that, in principle, the fundamental problems in 
the many-particle theory may be solved in a simple 
way: the eigenfunctions to (65) may be determined by 
the method of configurational interaction, which leads 
to a system of linear equations (70) with a secular 
equation (71) for determining the eigenvalues, and the 
convergence problem may then be treated by the diag- 
onalization (74) of the first-order density matrix and 
the introduction of natural spin-orbitals. 

However, the discussion in the previous section 
shows that, if our basic set is only of a finite order M, 
the circumstances are much more complicated. The
conditions for determining the best set of this order
are nonlinear integro-differential equations
of the type (92) or of the approximate form (105), which
may be solved only numerically by successive approximations,
using the method of “self-consistent fields.”
In the case M=N, i.e., in the ordinary Hartree-Fock 
scheme, it is certainly worthwhile to try to carry out 
this numerical procedure, since the corresponding solu- 
tion has a physical simplicity and visuality of great 
importance. However, in the case M>N, it can be 
discussed whether it is worth the trouble to solve the 
complicated nonlinear equations (92) even in their 
simplified form (105). Instead it seems better to try to 
introduce an orthonormal set of a considerably higher 
order than M, where the limitation is given only by 
the capacity of the electronic computer or mathematical 
machine available, and to solve the algebraic secular 
equation (71) and the linear system (70). Afterwards, 
by transformation to natural spin-orbitals, one may 
then try to diminish the order of the basic set by taking 
only those spin-orbitals into account which have occu- 
pation numbers essentially different from zero. The 
number M of essential spin-orbitals, found in this way, 
is characteristic for the system and may serve for defining
“closed shells,” etc., in a more exact theory.

Our discussion could give the impression that it would be entirely meaningless to use any form of extended Hartree-Fock equations in the method of configurational interaction. However, in a following paper, we will show that, in treating degenerate systems and correlation effects, it is possible to extend the ordinary Hartree-Fock scheme for $M = N$ to include a specific form of “fixed” configurational interaction based on the use of projection operators. The total wave function is here defined as the “projection” of a single determinant, and the basic set in this determinant of order $M = N$ is determined by an ordinary Hartree-Fock equation associated with a “composite” Hamiltonian, modified to take the degeneracy into proper account. This form of “fixed” configurational interaction has the advantage that it is possible to preserve some of the physical simplicity and visuality of the ordinary Hartree-Fock scheme.


APPENDIX I 
An Integral Formula 


Let $\Psi_0$ be an approximate (or exact) solution to the Schrödinger equation (21), which does not possess the correct symmetry property. Since $\Omega_{\mathrm{op}}$ is symmetric in the coordinates, every function $P\Psi_0$ is then also a solution of the same type, and the linear combination

$$\Psi_{\mathrm{AS}} = (N!)^{-1/2}\sum_P(-1)^p P\Psi_0, \tag{107}$$

summed over all $N!$ permutations $P$ ($p$ being the parity), has the correct antisymmetry character. This new wave function is simple to deal with in calculations, for, if $\Phi$ is an arbitrary antisymmetric function obtained, e.g., by letting a symmetric operator work on an antisymmetric wave function, we obtain

$$\begin{aligned} \int\Psi_{\mathrm{AS}}^*\Phi\,(dx) &= (N!)^{-1/2}\sum_P(-1)^p\int P^*\Psi_0^*\Phi\,(dx) \\ &= (N!)^{-1/2}\sum_P(-1)^p\int\Psi_0^*P^{-1}\Phi\,(dx) \\ &= (N!)^{1/2}\int\Psi_0^*\Phi\,(dx), \end{aligned} \tag{108}$$

since the sum over all $P$ contains $N!$ identical terms. We get therefore the basic formula:

$$\int\Psi_{\mathrm{AS}}^*\Phi\,(dx) = (N!)^{1/2}\int\Psi_0^*\Phi\,(dx), \tag{109}$$

which is of value in treating wave functions $\Psi_0$ built on simpler elements, as one- or two-particle functions.
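
The integral formula (109) can be checked numerically when the integrals are replaced by sums over a small grid. The following is a minimal sketch under stated assumptions: $N = 3$ particles, real-valued functions, a four-point grid, and hypothetical test functions `psi0` and `phi`; none of these choices come from the text.

```python
import itertools
import math

def parity(perm):
    """Return (-1)^p for a permutation given as a tuple of indices."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def antisymmetrize(func, n):
    """Psi_AS = (N!)^(-1/2) sum_P (-1)^p P func, cf. Eq. (107)."""
    norm = 1.0 / math.sqrt(math.factorial(n))
    perms = list(itertools.permutations(range(n)))
    def psi_as(*x):
        return norm * sum(parity(p) * func(*(x[i] for i in p)) for p in perms)
    return psi_as

def integrate(f, g, grid, n):
    """Discrete stand-in for the integral of f* g (dx), for real f."""
    return sum(f(*x) * g(*x) for x in itertools.product(grid, repeat=n))

N = 3
grid = [0.0, 1.0, 2.0, 3.0]

def psi0(a, b, c):
    # Hypothetical function without any symmetry property.
    return (a + 1.0) * b * c ** 2 + a

# Hypothetical antisymmetric function Phi, built by antisymmetrizing b * c^2.
phi = antisymmetrize(lambda a, b, c: b * c ** 2, N)
psi_as = antisymmetrize(psi0, N)

lhs = integrate(psi_as, phi, grid, N)
rhs = math.sqrt(math.factorial(N)) * integrate(psi0, phi, grid, N)
print("left-hand side of (109):", lhs)
print("right-hand side of (109):", rhs)
```

The right-hand side needs only the unsymmetrized function, which is why the formula saves work when $\Psi_0$ is a simple product.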
Quantum Theory of Many-Particle Systems. II. Study of the Ordinary
Hartree-Fock Approximation*


Per-Olov Löwdin
Department of Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts, and
Institute of Mechanics and Mathematical Physics, Uppsala University, Uppsala, Sweden
(Received July 8, 1954)


A system of N antisymmetric particles, moving under the influence of a fixed potential and their mutual many-particle interactions, is investigated in the ordinary Hartree-Fock scheme, having the total wave function approximated by a single Slater determinant. It is shown that all the density matrices of various orders, the wave function, and the entire physical situation depend only on a fundamental invariant $\rho(x_1, x_2)$, which is identical with the first-order density matrix. The Hartree-Fock equations are expressed in terms of this quantity.

The Hartree-Fock equations are also solved by expanding the eigenfunctions in a given complete set, and applications to the MO-LCAO theory of the electronic structure of molecules and crystals are given. It is shown that, in this scheme, the entire physical situation depends on a charge- and bond-order matrix R(vm) with respect to the ordinary atomic spin-orbitals involved. The Hartree-Fock equations for this matrix are investigated.

Finally, the ionized and excited states are investigated, and it is shown that the Hartree-Fock scheme has a high degree of physical visuality also in case of many-particle interactions. The excitation energy of the system is the difference (w;’—w;) between two “spin-orbital energies,” being eigenvalues to the effective Hamiltonians associated with the two states under consideration.





In a preceding paper,¹ we have investigated the possibilities for expressing the total wave function $\Psi$ for a system of $N$ antisymmetric particles by a series of Slater determinants over all configurations of order $N$, formed from a basic complete set $\psi_k$ of one-particle functions or spin-orbitals. This basic set may have been arbitrarily chosen, and the convergence of the configuration expansion is then correspondingly slow. However, if we introduce the natural spin-orbitals $\chi_k$ diagonalizing the first order density matrix $\gamma(x_1'|x_1)$, we obtain the configuration expansion of most rapid convergence, which is directly connected with the convergency of the series

$$\sum_k n_k = N, \tag{1}$$

where the occupation numbers $n_k$ fulfill the condition $0 \le n_k \le 1$. The $N$ particles are therefore always distributed over more than $N$ spin-orbitals, but, mathematically, there is a limiting case when exactly $N$ natural spin-orbitals are fully occupied, and the configuration expansion is then reduced to a single Slater determinant:

$$\Psi = (N!)^{-1/2}\det\{\chi_1, \chi_2, \cdots \chi_N\}. \tag{2}$$

Physically, this wave function would have a particular importance since it is the simplest wave function based on the “independent-particle model” which has the correct antisymmetry property. However, in constructing this wave function by antisymmetrizing a simple product, the mutual interaction between the particles is usually only partly taken into account, and this means that the limiting case cannot have a physical reality and that the wave function cannot be exact.


* This work was supported in part by the U. S. Office of Naval Research under its contract with Massachusetts Institute of Technology.

1 P. O. Löwdin, preceding paper [Phys. Rev. 96, 1474 (1954)], in the following referred to as Part I.

The many-particle theory based on the approximate wave function (2) is usually called the Hartree-Fock scheme,² and it represents the first important step towards a more exact theory of antisymmetric particles. The scheme has been developed in great detail for the electronic structure of the atoms by Hartree³ and his collaborators, and a large part of the periodic system is now covered in their applications. It will certainly take a rather long time, before a theory of similar accuracy has been developed for molecules and crystals, but the basic principles are well known and have been discussed by several authors; recent contributions to this field have been given by Mulliken,⁴ Roothaan,⁵ Slater,⁶ and others.

In the previous discussions of the Hartree-Fock approximation, one has usually started from a basic set of $N$ individual Hartree-Fock functions or spin-orbitals, but we will here emphasize another aspect of the scheme, namely that the properties of the system are dependent only on the first-order density matrix

$$\gamma(x_1'|x_1) \tag{3}$$

but independent of the individual spin-orbitals which are used in forming this matrix. All the higher-order density matrices may be expressed in terms of (3), and, in order


² D. R. Hartree, Proc. Cambridge Phil. Soc. 24, 89 (1928); V. Fock, Z. Physik 61, 126 (1930); J. C. Slater, Phys. Rev. 35, 210 (1930).

³ For a survey, see D. R. Hartree, Repts. Progr. Phys. 11, 113 (1…).

⁴ R. S. Mulliken, J. chim. phys. 46, 497, 675 (1949).

⁵ C. C. J. Roothaan, Revs. Modern Phys. 23, 69 (1951).

⁶ J. C. Slater, Phys. Rev. 81, 385 (1951); 82, 538 (1951); see also his series of Technical Reports of the Solid-State and Molecular Theory Group at Massachusetts Institute of Technology 1951-1954 (unpublished).
to describe the entire physical situation of the system in the Hartree-Fock approximation, it is therefore sufficient to describe its first-order density matrix (3). The theory of electronic structure of molecules and crystals may consequently be reduced to a comparatively simple quantity.

Due to its connection with the "independent-particle model," the Hartree-Fock scheme has further a physical simplicity and visuality which is of importance in discussing, e.g., ionizations and excitations of the system. We will here investigate these properties in greater detail with the particular intention to find out whether, in some way, it would be possible to preserve this visuality also in a more exact theory based on configurational interaction. In this connection, we will consider physical quantities which are represented by operators of the form (I, 2), and we note that even many-particle operators may occur. This is of importance not only in the nuclear theory but also for the extension of the ordinary Hartree-Fock scheme to include degenerate systems and correlation effects, which will be discussed in a following paper.


1. DENSITY AND TRANSITION MATRICES IN THE 
HARTREE-FOCK SCHEME 


In Part I, it was shown that all physical properties of a system of $N$ antisymmetric particles may be characterized by means of a series of density matrices (I, 3) of various orders, and that a transition between two states of the system may be described by a similar series of transition matrices (I, 17). We will now investigate the special form of these fundamental quantities in the Hartree-Fock approximation.


(a) Density Matrix for a Single Slater Determinant 


Let us start by considering a wave function of the form

$$U(x_1x_2\cdots x_N) = (N!)^{-1/2}\det\{u_1,u_2,\cdots u_N\}, \tag{4}$$

which is built up of a set of $N$ spin-orbitals $u_1, u_2, \cdots u_N$, being linearly independent but not necessarily orthogonal. The basic spin-orbitals may therefore have nonorthogonality integrals

$$\int u_k^*(x_1)\,u_l(x_1)\,dx_1 = d(kl), \tag{5}$$

which are different from zero, if $k \neq l$. According to (I, 39), the normalization integral for $U$ is then given by

$$\int U^*U\,(dx) = D = \det\{d(kl)\}. \tag{6}$$


The wave function (4) is characterized by the fact that, except for an unessential factor which vanishes in the normalization, it is invariant against linear transformations of the spin-orbitals involved. We will consider a transformation

$$u_k' = \sum_{\alpha=1}^{N} u_\alpha a_{\alpha k}. \tag{7}$$

According to the well-known theorem for determinant multiplication, we then obtain

$$U' = U\det\{a_{\alpha k}\}, \tag{8}$$

$$d'(kl) = \sum_{\alpha\beta} a^\dagger_{k\alpha}\,d(\alpha\beta)\,a_{\beta l}, \tag{9}$$

$$D' = \det\{a^\dagger_{k\alpha}\}\cdot D\cdot\det\{a_{\beta l}\}. \tag{10}$$

Since $D \neq 0$, the matrix $\mathbf{d}$ of the elements (5) has an inverse matrix $\mathbf{d}^{-1}$ having the elements $d^{-1}(lk) = D(kl)/D$. According to (9), the matrix $\mathbf{d}^{-1}$ has the transformation property

$$\mathbf{d}'^{-1} = \mathbf{a}^{-1}\,\mathbf{d}^{-1}\,(\mathbf{a}^\dagger)^{-1}. \tag{11}$$
In investigating the transformations, we may consider the basic set $u_k$ $(k=1, 2, \cdots N)$ as the components of a vector in a not necessarily orthogonal Hilbert space and the relation (7) as a vector transformation. The "length" of this vector, defined by the relation

$$\rho(x_1,x_2) = \sum_{kl} u_k^*(x_1)\,u_l(x_2)\,d^{-1}(lk), \tag{12}$$

is then the only fundamental invariant against (7), and, by using (7) and (11), this invariance is easily checked. A quantity of this type was first introduced by Fock² and investigated in detail by Dirac⁷ for the orthogonal case $(d_{kl}=\delta_{kl})$, but, considering the applications to molecules and crystals with atomic orbitals having overlap integrals essentially different from zero, we have here carried out the generalization to the nonorthogonal case. By using (5) and (12), we find that $\rho$ fulfills the two matrix relations

$$\rho^2 = \rho, \qquad \mathrm{Tr}(\rho) = N, \tag{13}$$

and $\rho$ is therefore a projection operator; see also (I, 85).
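The relations (12) and (13) can be illustrated numerically. The following is a minimal sketch, assuming that the coordinate $x$ is replaced by a hypothetical grid of `M` points with unit weights, so that integrals become sums; the random orbitals, the matrix `a`, and the helper name `fundamental_invariant` are illustrative assumptions and not part of the paper.

```python
import numpy as np

def fundamental_invariant(u):
    """Return P = u d^{-1} u^dagger for orbital columns u of shape (M, N).

    With this index convention P[x2, x1] equals rho(x1, x2) of Eq. (12);
    idempotency and trace do not depend on the convention.
    """
    d = u.conj().T @ u                      # overlap matrix d(kl), Eq. (5)
    return u @ np.linalg.inv(d) @ u.conj().T

rng = np.random.default_rng(0)
M, N = 8, 3                                 # assumed grid size and particle number
u = rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N))   # nonorthogonal orbitals
a = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))   # linear transformation (7)

rho = fundamental_invariant(u)
rho_transformed = fundamental_invariant(u @ a)

idempotent = np.allclose(rho @ rho, rho)         # first relation (13)
trace = np.trace(rho).real                       # second relation (13)
invariant = np.allclose(rho, rho_transformed)    # invariance against (7)
```

In this discretized picture the invariance is simply the statement that the projector onto the space spanned by the orbitals does not depend on which basis of that space is used.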

Since the general densities (3) of various orders are all invariant with respect to the transformation (7), we may expect that they must be functions of the fundamental invariant (12), and the explicit form for these functions is easily found. Since $D^{-1}=\det\{d^{-1}(lk)\}$, we get for the density matrix of order $N$:

$$\Gamma^{(N)}(x_1'x_2'\cdots x_N'|x_1x_2\cdots x_N) = U^*(x_1'x_2'\cdots x_N')\,U(x_1x_2\cdots x_N)\,D^{-1}$$
$$= (N!)^{-1}\det\{u_k^*(x_i')\}\,\det\{u_l(x_j)\}\,\det\{d^{-1}(lk)\}$$
$$= (N!)^{-1}\det\{\rho(x_i',x_j)\}, \tag{14}$$


where we have used (12) and the ordinary law of 
determinant multiplication. The higher densities may 
now be derived successively according to (I, 5) and 


⁷ P. A. M. Dirac, Proc. Cambridge Phil. Soc. 26, 376 (1930); 27, 240 (1931).
(13). Let us assume that the density of order $p$ has the explicit form

$$\Gamma^{(p)}(x_1'x_2'\cdots x_p'|x_1x_2\cdots x_p) = (p!)^{-1}
\begin{vmatrix}
\rho(x_1',x_1) & \rho(x_1',x_2) & \cdots & \rho(x_1',x_p)\\
\rho(x_2',x_1) & \rho(x_2',x_2) & \cdots & \rho(x_2',x_p)\\
\vdots & \vdots & & \vdots\\
\rho(x_p',x_1) & \rho(x_p',x_2) & \cdots & \rho(x_p',x_p)
\end{vmatrix}, \tag{15}$$

which apparently is true for $p=N$. By developing this density after its last column, putting $x_p'=x_p$, and integrating by using (13), we find that the density of order $(p-1)$ has exactly the same form, which proves our theorem. However, by reversing the arguments used in forming (14), we may now expand the density of order $p$ by using the law for the determinant of the product of two rectangular matrices⁸ in the following way:

$$\Gamma^{(p)}(x_1'x_2'\cdots x_p'|x_1x_2\cdots x_p)
= (p!)^{-1}\sum_{\substack{k_1<k_2<\cdots<k_p\\ l_1<l_2<\cdots<l_p}}
\begin{vmatrix}
u_{k_1}^*(x_1') & \cdots & u_{k_p}^*(x_1')\\
\vdots & & \vdots\\
u_{k_1}^*(x_p') & \cdots & u_{k_p}^*(x_p')
\end{vmatrix}
\begin{vmatrix}
u_{l_1}(x_1) & \cdots & u_{l_p}(x_1)\\
\vdots & & \vdots\\
u_{l_1}(x_p) & \cdots & u_{l_p}(x_p)
\end{vmatrix}
\begin{vmatrix}
d^{-1}(l_1k_1) & \cdots & d^{-1}(l_1k_p)\\
\vdots & & \vdots\\
d^{-1}(l_pk_1) & \cdots & d^{-1}(l_pk_p)
\end{vmatrix}$$
$$= (p!)^{-1}\sum_{\substack{k_1k_2\cdots k_p\\ l_1l_2\cdots l_p}}
u_{k_1}^*(x_1')\cdots u_{k_p}^*(x_p')\,u_{l_1}(x_1)\cdots u_{l_p}(x_p)
\begin{vmatrix}
d^{-1}(l_1k_1) & \cdots & d^{-1}(l_1k_p)\\
\vdots & & \vdots\\
d^{-1}(l_pk_1) & \cdots & d^{-1}(l_pk_p)
\end{vmatrix}. \tag{16}$$

According to (I, 47) and (I, 48), we hence obtain for the density matrix of order $p$ in the space defined by the nonorthogonal basic set $u_k$ $(k=1, 2, \cdots N)$:

$$\Gamma^{(p)}(l_1l_2\cdots l_p|k_1k_2\cdots k_p) = (p!)^{-1}
\begin{vmatrix}
d^{-1}(l_1k_1) & \cdots & d^{-1}(l_1k_p)\\
\vdots & & \vdots\\
d^{-1}(l_pk_1) & \cdots & d^{-1}(l_pk_p)
\end{vmatrix}. \tag{17}$$

The charge- and bond-order matrix of order $p$ is therefore entirely characterized by the components of the matrix $\mathbf{d}^{-1}$, and for $p=1$, we obtain in particular

$$\gamma(l|k) = d^{-1}(lk). \tag{18}$$

Let us now turn back to the $x$-space. According to (15) for $p=1$ and $p=2$, we have particularly

$$\gamma(x_1'|x_1) = \rho(x_1',x_1), \tag{19}$$

$$\Gamma(x_1'x_2'|x_1x_2) = \tfrac{1}{2}
\begin{vmatrix}
\rho(x_1',x_1) & \rho(x_1',x_2)\\
\rho(x_2',x_1) & \rho(x_2',x_2)
\end{vmatrix}, \tag{20}$$

i.e., the first-order density matrix is identical with the fundamental invariant (12). This means that, if the total wave function is approximated by a single Slater determinant, the first-order density matrix determines also all the higher-order density matrices by (15), the normalized wave function by (14), and hence the entire physical situation. We can now also make a proper interpretation of the first relation $\rho^2=\rho$ in (13), which is equivalent with the relation $\gamma^2=\gamma$; it means that the eigenvalues of the matrix $\gamma(l|k)$ or the occupation numbers $n_k$ must be either 0 or 1. This result is characteristic for the Hartree-Fock approximation, and, in connection with our results in Part I (Sec. 4), we have then shown that the relation $\gamma^2=\gamma$ is the necessary and sufficient condition for reducing the total wave function $\Psi$ to a single determinant.

In the Hartree-Fock approximation, it is hence not necessary to specify either the total wave function or the special set of spin-orbitals used in the calculations, since all information about the system in the specific state under consideration is contained in the fundamental invariant $\rho(x_1,x_2)$ given by (12). By using (15), the fundamental formula (I, 10) takes the form

$$\langle\Omega_{\mathrm{op}}\rangle_{\mathrm{Av}} = \Omega_{(0)} + \int\Omega_1\,\rho(1',1)\,dx_1
+ (2!)^{-1}\int\Omega_{12}
\begin{vmatrix}
\rho(1',1) & \rho(1',2)\\
\rho(2',1) & \rho(2',2)
\end{vmatrix}dx_1dx_2$$
$$+ (3!)^{-1}\int\Omega_{123}
\begin{vmatrix}
\rho(1',1) & \rho(1',2) & \rho(1',3)\\
\rho(2',1) & \rho(2',2) & \rho(2',3)\\
\rho(3',1) & \rho(3',2) & \rho(3',3)
\end{vmatrix}dx_1dx_2dx_3 + \cdots. \tag{21}$$


The corresponding formula for the orthogonal case and two-particle operators was first derived by Fock² in his pioneer work. The average value of the operator $\Omega_{\mathrm{op}}$ may now also be expressed directly in terms of the basic set $u_k$.
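The determinant formulas (14) and (20) lend themselves to a direct numerical check. The sketch below assumes real orbitals sampled on a hypothetical grid of `M` points (integrals replaced by sums with unit weights); the grid points chosen for `xp` and `x` and the helper `slater` are arbitrary illustrations.

```python
import math
import numpy as np

rng = np.random.default_rng(1)
M, N = 6, 3                                # assumed grid size and particle number
u = rng.normal(size=(M, N))                # real nonorthogonal orbitals u_k(x)
d = u.T @ u                                # overlap matrix, Eq. (5)
D = np.linalg.det(d)                       # normalization integral, Eq. (6)
rho = u @ np.linalg.inv(d) @ u.T           # fundamental invariant, Eq. (12)

def slater(points):
    """U(x_1 ... x_N) = (N!)^{-1/2} det{u_k(x_i)} at the given grid points."""
    return np.linalg.det(u[points, :]) / math.sqrt(math.factorial(N))

xp = [0, 2, 5]                             # primed coordinates x_i'
x = [1, 3, 4]                              # unprimed coordinates x_j

# Eq. (14): U*(x') U(x) / D against (N!)^{-1} det{rho(x_i', x_j)}
lhs = slater(xp) * slater(x) / D
rhs = np.linalg.det(rho[np.ix_(xp, x)]) / math.factorial(N)

# Trace of the second-order density (20): (1/2)[(Tr rho)^2 - Tr(rho^2)]
pair_trace = 0.5 * (np.trace(rho) ** 2 - np.sum(rho * rho.T))
```

Because of (13), the trace of the second-order density reduces to $N(N-1)/2$, the number of pairs, which the last line reproduces.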


⁸ See Part I, reference 7.
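Introducing the matrix notations

$$(k|\Omega_1|l) = \int u_k^*(1)\,\Omega_1\,u_l(1)\,dx_1,$$
$$(k_1k_2|\Omega_{12}|l_1l_2) = \int u_{k_1}^*(1)\,u_{k_2}^*(2)\,\Omega_{12}\,u_{l_1}(1)\,u_{l_2}(2)\,dx_1dx_2, \tag{22}$$

and putting expansion (16) into (I, 10), we obtain

$$\langle\Omega_{\mathrm{op}}\rangle_{\mathrm{Av}} = \Omega_{(0)} + \sum_{kl}(k|\Omega_1|l)\,d^{-1}(lk)
+ (2!)^{-1}\sum_{\substack{k_1k_2\\ l_1l_2}}(k_1k_2|\Omega_{12}|l_1l_2)
\begin{vmatrix}
d^{-1}(l_1k_1) & d^{-1}(l_1k_2)\\
d^{-1}(l_2k_1) & d^{-1}(l_2k_2)
\end{vmatrix}$$
$$+ (3!)^{-1}\sum_{\substack{k_1k_2k_3\\ l_1l_2l_3}}(k_1k_2k_3|\Omega_{123}|l_1l_2l_3)
\begin{vmatrix}
d^{-1}(l_1k_1) & \cdots & d^{-1}(l_1k_3)\\
\vdots & & \vdots\\
d^{-1}(l_3k_1) & \cdots & d^{-1}(l_3k_3)
\end{vmatrix} + \cdots. \tag{23}$$

This is the general formula for the average value of an arbitrary physical quantity (I, 2), containing also many-particle operators, for a nonorthogonal basic set $u_k$. The nonorthogonality problem has been discussed rather extensively in the literature;⁹ we note that the corresponding formula for two-particle operators, previously given by the author,¹⁰ was derived in an entirely different way.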
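As an illustration of how (23) is used, the following sketch evaluates the one- and two-particle terms for a nonorthogonal set and compares them with the density form (21). It assumes a hypothetical grid model with real orbitals, a symmetric matrix `h` standing in for $\Omega_1$, and a local symmetric interaction `w[x1, x2]` standing in for $\Omega_{12}$; all data are random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 5, 2                                    # assumed grid size and particle number
h = rng.normal(size=(M, M)); h = h + h.T       # stand-in for the one-particle operator
w = rng.normal(size=(M, M)); w = w + w.T       # stand-in for a local two-particle operator
u = rng.normal(size=(M, N))                    # nonorthogonal real orbitals

d = u.T @ u                                    # Eq. (5)
dinv = np.linalg.inv(d)

# Matrix elements (22) in the nonorthogonal basis
h_kl = u.T @ h @ u                                          # (k|O1|l)
g = np.einsum('xk,ym,xy,xl,yn->kmln', u, u, w, u, u)        # (k1 k2|O12|l1 l2)

# Eq. (23): the 2x2 determinant of d^{-1} written out explicitly
one = np.einsum('kl,lk->', h_kl, dinv)
two = 0.5 * (np.einsum('kmln,lk,nm->', g, dinv, dinv)
             - np.einsum('kmln,lm,nk->', g, dinv, dinv))

# Eq. (21): the same quantities from the fundamental invariant
rho = u @ dinv @ u.T
n = np.diag(rho)
ref_one = np.sum(h * rho)
ref_two = 0.5 * (n @ w @ n - np.sum(w * rho * rho.T))
```

The two routes agree because (16) is nothing more than the expansion of the determinants in (15) in terms of the basic set.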


(b) Transition Matrices for Two Slater 
Determinants 


Let us now consider two Slater determinants $U$ and $V$, built up from two not necessarily orthogonal sets $u_k$ $(k=1, 2, \cdots N)$ and $v_l$ $(l=1, 2, \cdots N)$ according to (I, 36). If the mutual nonorthogonality integrals are given by

$$d_{uv}(kl) = \int u_k^*(1)\,v_l(1)\,dx_1, \tag{24}$$

we have in (I, 39) obtained

$$\int U^*V\,(dx) = D_{UV} = \det\{d_{uv}(kl)\}. \tag{25}$$

For the discussion in this paper, we will assume that the basic determinant $D_{UV}$ is essentially different from zero: $D_{UV}\neq 0$. In this case, the matrix $\mathbf{d}_{uv}$ of the elements (24) has an inverse matrix $\mathbf{d}_{uv}^{-1}$ with the elements $d_{uv}^{-1}(lk) = D_{UV}(kl)/D_{UV}$.

Except for unessential factors, each determinant $U$ and $V$ is invariant against linear transformations of the type (7):

$$u_k' = \sum_\alpha u_\alpha a_{\alpha k}, \qquad v_l' = \sum_\beta v_\beta b_{\beta l}, \tag{26}$$

⁹ See Part I, reference 10.
¹⁰ See Part I, reference 11.






and the matrices $\mathbf{d}_{uv}$ and $\mathbf{d}_{uv}^{-1}$ have then in matrix form the transformation properties

$$\mathbf{d}_{uv}' = \mathbf{a}^\dagger\,\mathbf{d}_{uv}\,\mathbf{b}, \qquad \mathbf{d}_{uv}'^{-1} = \mathbf{b}^{-1}\,\mathbf{d}_{uv}^{-1}\,(\mathbf{a}^\dagger)^{-1}. \tag{27}$$

Let us now consider $u$ and $v$ as vectors in two associated nonorthogonal Hilbert spaces, and the transformations (26) as vector transformations. Then the "scalar product" of these two vectors, defined by

$$\rho_{uv}(x_1,x_2) = \sum_{kl} u_k^*(x_1)\,v_l(x_2)\,d_{uv}^{-1}(lk), \tag{28}$$

must be the only fundamental invariant against the transformations (26), and, by using (26) and (27), this invariance is easily checked. Using (24), we find that the matrix $\rho_{uv}$ also fulfills the characteristic relations (13) for a projection operator.

The transition matrices of various orders may now be calculated in the same way as for $u=v$. Introducing the normalization constant

$$k_{UV} = D_{UV}\,(D_{UU}D_{VV})^{-1/2}, \tag{29}$$

we find for $p=N$:

$$\Gamma_{UV}^{(N)}(x_1'x_2'\cdots x_N'|x_1x_2\cdots x_N) = k_{UV}\,U^*(x_1'x_2'\cdots x_N')\,V(x_1x_2\cdots x_N)\,\det\{d_{uv}^{-1}(lk)\}$$
$$= k_{UV}\,(N!)^{-1}\det\{\rho_{uv}(x_i',x_j)\}. \tag{30}$$

The density matrices of lower orders may then be found successively by using (I, 5) and (13), and the form is the same as in (15), (16), and (17) with $\rho$ and $\mathbf{d}^{-1}$ replaced by $\rho_{uv}$ and $\mathbf{d}_{uv}^{-1}$, respectively, multiplied by the normalization constant (29).
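The statements about two determinants can be tested on a small model. The sketch below assumes $N=2$ particles with real orbitals on a hypothetical grid of `M` points, builds both determinants explicitly as antisymmetric arrays, and compares brute-force sums with (25), (28), and (29); the symmetric matrix `h` is an assumed stand-in for a one-particle operator.

```python
import numpy as np

rng = np.random.default_rng(3)
M, N = 4, 2                               # assumed grid size; N = 2 keeps U, V explicit
u = rng.normal(size=(M, N))               # orbitals of determinant U
v = rng.normal(size=(M, N))               # orbitals of determinant V
h = rng.normal(size=(M, M)); h = h + h.T  # stand-in for a one-particle operator

def two_particle_determinant(orb):
    """(2!)^{-1/2} [a(x1) b(x2) - b(x1) a(x2)] as an M x M array."""
    a, b = orb[:, 0], orb[:, 1]
    return (np.outer(a, b) - np.outer(b, a)) / np.sqrt(2.0)

U, V = two_particle_determinant(u), two_particle_determinant(v)

d_uv = u.T @ v                                     # Eq. (24)
D_uv = np.linalg.det(d_uv)                         # Eq. (25)
D_uu = np.linalg.det(u.T @ u)
D_vv = np.linalg.det(v.T @ v)
k_uv = D_uv / np.sqrt(D_uu * D_vv)                 # Eq. (29)
rho_uv = u @ np.linalg.inv(d_uv).T @ v.T           # Eq. (28), indexed [x1, x2]

overlap_brute = np.sum(U * V)                      # integral of U* V over both particles

# Normalized transition element of h(1) + h(2), by brute force and via rho_uv
brute = np.sum(U * (h @ V + V @ h.T)) / np.sqrt(D_uu * D_vv)
formula = k_uv * np.sum(h * rho_uv)
```

The sketch presupposes $D_{UV}\neq 0$, as the text does; for random orbitals this holds with probability one, but it is not checked in the code.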

In analogy to (21) and (23), we get therefore for the transition element of an operator $\Omega_{\mathrm{op}}$ with respect to two Slater determinants $U$ and $V$ with $D_{UV}\neq 0$:

$$(U|\Omega_{\mathrm{op}}|V) = \int U^*\,\Omega_{\mathrm{op}}\,V\,(dx)\Big/(D_{UU}D_{VV})^{1/2}$$
$$= k_{UV}\Big\{\Omega_{(0)} + \int\Omega_1\,\rho_{uv}(1',1)\,dx_1 + (2!)^{-1}\int\Omega_{12}
\begin{vmatrix}
\rho_{uv}(1',1) & \rho_{uv}(1',2)\\
\rho_{uv}(2',1) & \rho_{uv}(2',2)
\end{vmatrix}dx_1dx_2 + \cdots\Big\}$$
$$= k_{UV}\Big\{\Omega_{(0)} + \sum_{kl}\{k|\Omega_1|l\}\,d_{uv}^{-1}(lk) + (2!)^{-1}\sum_{\substack{k_1k_2\\ l_1l_2}}\{k_1k_2|\Omega_{12}|l_1l_2\}
\begin{vmatrix}
d_{uv}^{-1}(l_1k_1) & d_{uv}^{-1}(l_1k_2)\\
d_{uv}^{-1}(l_2k_1) & d_{uv}^{-1}(l_2k_2)
\end{vmatrix} + \cdots\Big\}, \tag{31}$$

where the matrix elements are now defined by (I, 50). In the special case when $D_{UV}=0$, the invariant (28) no longer exists, and formula (31) loses its meaning. The general method developed in Part I, Sec. 2 b, covers also this case, but it remains to investigate whether the various transition matrices may be reduced to some simpler fundamental invariants also in this case. We will here leave this problem open.

2. HARTREE-FOCK EQUATIONS IN TERMS OF
THE FUNDAMENTAL INVARIANT

(a) Variation of an Orthonormal Basic Set

Let us now determine the basic set $u_1, u_2, \cdots u_N$ in such a way that the Slater determinant (4) forms as accurate an approximation as possible to a solution of the eigenvalue problem:

$$\Omega_{\mathrm{op}}\Psi = \omega\Psi, \tag{32}$$

where $\Omega_{\mathrm{op}}$ given by (I, 2) may contain also many-particle terms. Up till now we have not imposed any particular condition on the set $u_k$, but, by a suitable linear transformation, this set may now be orthonormalized without changing the character of the total wave function. For this purpose, we will use the formula¹¹:

$$\psi_k = \sum_\alpha u_\alpha\,d^{-1/2}(\alpha k), \tag{33}$$

and we note that the set $\psi_k$ $(k=1, 2, \cdots N)$ has the required orthonormality property:

$$\int\psi_k^*\psi_l\,dx = \delta_{kl}. \tag{34}$$

Putting (33) into the fundamental invariant (12), we obtain

$$\rho(x_1,x_2) = \sum_{k=1}^{N}\psi_k^*(x_1)\,\psi_k(x_2), \tag{35}$$

which is just the Fock-Dirac density matrix with the spin functions explicitly included in accordance with the recommendations by Slater. This is also the diagonalized form (I, 74) of the first-order density matrix $\gamma(x_1|x_2)$, and, due to the degeneracy $n_1=n_2=\cdots=n_N=1$, the set $\psi_1, \psi_2, \cdots\psi_N$ is still undetermined up to a unitary transformation of spin-orbitals.

¹¹ See Part I, reference 11. The matrix $\mathbf{d}^{-1/2}$ may here be constructed by first bringing the Hermitean matrix $\mathbf{d}$ to diagonal form by a unitary transformation $\mathbf{U}$, taking the inverse of the square root out of the diagonal elements (which are all positive), and going back to the original representation by the unitary transformation $\mathbf{U}^\dagger$.

We will now determine the best choice of the set $\psi_k$ by applying the variation principle (I, 22). Using (21), varying the set $\psi_k$, and taking the orthonormality condition (34) into account by introducing a Hermitean matrix $\lambda(l|k)$ of Lagrangian multipliers, we obtain a set of conditions which, according to (I, 92), may be condensed in the form

$$\Omega_1\,\rho(\xi_1',x_1) + \int\Omega_{12}
\begin{vmatrix}
\rho(\xi_1',x_1) & \rho(\xi_1',x_2)\\
\rho(x_2',x_1) & \rho(x_2',x_2)
\end{vmatrix}dx_2$$
$$+ (2!)^{-1}\int\Omega_{123}
\begin{vmatrix}
\rho(\xi_1',x_1) & \rho(\xi_1',x_2) & \rho(\xi_1',x_3)\\
\rho(x_2',x_1) & \rho(x_2',x_2) & \rho(x_2',x_3)\\
\rho(x_3',x_1) & \rho(x_3',x_2) & \rho(x_3',x_3)
\end{vmatrix}dx_2dx_3 + \cdots = \lambda(\xi_1'|x_1), \tag{36}$$

where the operators $\Omega$ do not work on the primed variables and, after the operations in the integrands have been carried out, we have to put all $x_i'=x_i$. This is the Hartree-Fock equation for the fundamental invariant $\rho(\xi_1',x_1)$. The function

$$\lambda(\xi_1'|x_1) = \sum_{kl}\psi_k^*(\xi_1')\,\psi_l(x_1)\,\lambda(l|k), \tag{37}$$

in the right-hand member was here derived by using the orthonormality condition (34) for the individual spin-orbitals $\psi_k$, but we will later see that we can formulate this auxiliary condition also in terms of the invariant $\rho$ without reference to any orthonormality property.

Since the density matrix (35) is already in diagonal form, we may now use the remaining unitary transformation of the set $\psi_k$ for diagonalizing the matrix $\lambda(l|k)$ in the function (37), and we will denote the eigenvalues by $\omega_k$. Applying (I, 96), we may now rewrite (36) in the form

$$\Omega_1\,\psi_k(x_1) + \int\Omega_{12}
\begin{vmatrix}
\psi_k(x_1) & \psi_k(x_2)\\
\rho(x_2',x_1) & \rho(x_2',x_2)
\end{vmatrix}dx_2$$
$$+ (2!)^{-1}\int\Omega_{123}
\begin{vmatrix}
\psi_k(x_1) & \psi_k(x_2) & \psi_k(x_3)\\
\rho(x_2',x_1) & \rho(x_2',x_2) & \rho(x_2',x_3)\\
\rho(x_3',x_1) & \rho(x_3',x_2) & \rho(x_3',x_3)
\end{vmatrix}dx_2dx_3 + \cdots = \omega_k\,\psi_k(x_1), \tag{38}$$
which is just the set of ordinary Hartree-Fock equations for the individual functions $\psi_k$ $(k=1, 2, \cdots N)$ generalized to many-particle operators. Introducing the operator $P_{ij}$ for permuting the variables $x_i$ and $x_j$, we may instead write (38) in the form of an eigenvalue problem

$$\Omega_{\mathrm{eff}}(1)\,\psi_k(x_1) = \omega_k\,\psi_k(x_1), \tag{39}$$

where the "effective" one-particle operator $\Omega_{\mathrm{eff}}$ corresponding to (I, 2) has the explicit form

$$\Omega_{\mathrm{eff}}(1) = \Omega_1 + \int\Omega_{12}
\begin{vmatrix}
1 & P_{12}\\
\rho(x_2',x_1) & \rho(x_2',x_2)
\end{vmatrix}dx_2
+ (2!)^{-1}\int\Omega_{123}
\begin{vmatrix}
1 & P_{12} & P_{13}\\
\rho(x_2',x_1) & \rho(x_2',x_2) & \rho(x_2',x_3)\\
\rho(x_3',x_1) & \rho(x_3',x_2) & \rho(x_3',x_3)
\end{vmatrix}dx_2dx_3 + \cdots, \tag{40}$$

where the permutation operators $P_{ij}$ should work directly on the wave function, i.e., should be written to the right of the $\rho$-factors in expanding the determinants. In this operator, the elements $P_{11}=1$ give rise to ordinary potentials, whereas the elements $P_{1j}$ for $j\neq 1$ give rise to exchange potentials. Expanding the determinants after their first rows, we can rewrite the effective $\Omega$-operator in the form

$$\Omega_{\mathrm{eff}}(1) = \Omega_1 + V_{\mathrm{op}}(1), \tag{41}$$

where

$$V_{\mathrm{op}}(1) = \int\Omega_{12}\,(1-P_{12})\,\rho(2',2)\,dx_2
+ (2!)^{-1}\int\Omega_{123}\,(1-P_{12}-P_{13})
\begin{vmatrix}
\rho(2',2) & \rho(2',3)\\
\rho(3',2) & \rho(3',3)
\end{vmatrix}dx_2dx_3 + \cdots. \tag{42}$$

As before we have here used the convention that, after the operations have been carried out, we have to put all $x_i'=x_i$. It is easily shown that $\Omega_{\mathrm{eff}}$ is a Hermitean operator, and this implies that the eigenfunctions $\psi_k$ to (39) belonging to different eigenvalues $\omega_k$ are automatically orthogonal, consistent with the fact that we have here transformed away all nondiagonal Lagrangian multipliers, which otherwise could be used for preserving the orthogonality.

(b) Spin-Orbital Interaction

By means of the variation principle (I, 22) and the simplifying assumption about the form (4) of the wave function, the original eigenvalue problem (32) in configuration space for the many-particle operator (I, 2) has now been reduced to an eigenvalue problem for one particle in the ordinary $x$-space. The effective $\Omega$ operator in (39) has the advantage of a certain degree of physical visuality: it consists of the one-particle term in the original operator plus an "average potential" on the particle, depending on its interaction with all the other particles. This visuality may be emphasized by introducing the individual spin-orbitals $\psi_k$ or the density matrices $\rho_k(x_1,x_2)=\psi_k^*(x_1)\psi_k(x_2)$ associated with them.

Expanding the density matrix (35) in the form

$$\rho(x_1,x_2) = \sum_k\rho_k(x_1,x_2), \tag{43}$$

and substituting this expression into (42), we obtain

$$V_{\mathrm{op}}(1) = \sum_l V^l(1) + (2!)^{-1}\sum_{lm}V^{lm}(1) + \cdots, \tag{44}$$

where

$$V^l(1) = \int\Omega_{12}\,(1-P_{12})\,\rho_l(2',2)\,dx_2,$$
$$V^{lm}(1) = \int\Omega_{123}\,(1-P_{12}-P_{13})
\begin{vmatrix}
\rho_l(2',2) & \rho_l(2',3)\\
\rho_m(3',2) & \rho_m(3',3)
\end{vmatrix}dx_2dx_3; \tag{45}$$


Here each term has a specific physical meaning: $V^l(1)$ is the potential on particle 1 arising from another particle in spin-orbital $l$ due to the two-particle interaction operator $\Omega_{12}$; $V^{lm}(1)$ is the potential on particle 1 arising from a pair of particles in the spin-orbitals $l$ and $m$ due to the three-particle interaction operator $\Omega_{123}$; etc. The effect of these operators on an arbitrary function $f(x_1)$ is demonstrated by the formulas

$$V^l(1)\,f(x_1) = \int\Omega_{12}
\begin{vmatrix}
f(1) & f(2)\\
\rho_l(2',1) & \rho_l(2',2)
\end{vmatrix}dx_2,$$
$$V^{lm}(1)\,f(x_1) = \int\Omega_{123}
\begin{vmatrix}
f(1) & f(2) & f(3)\\
\rho_l(2',1) & \rho_l(2',2) & \rho_l(2',3)\\
\rho_m(3',1) & \rho_m(3',2) & \rho_m(3',3)
\end{vmatrix}dx_2dx_3; \tag{46}$$


In discussing the physical interpretation of the potentials (45), we note that the first term in each of them has an almost "classical" meaning, whereas the terms containing the permutation operators $P_{1j}$ depend on the antisymmetry requirement (I, 1) and therefore correspond to typical quantum-mechanical effects. We have seen that, in the $x$-space, the antisymmetry leads to the relation (I, 8) and the existence of the "Fermi hole," and we observe now another consequence, namely:

$$V^k(1)\,\psi_k(x_1) = 0, \qquad V^{kl}(1)\,\psi_k(x_1) = 0, \tag{47}$$

which means that the particle in spin-orbital $k$ does not interact with itself. This important result is not limited to the ordinary Hartree-Fock approximation, for, in considering the extended Hartree-Fock equations (I, 89), (I, 90), (I, 92), and (I, 96) for limited configurational interaction, we find that the coefficients for the interactions of various orders are given by the matrix elements

$$\Gamma^{(p)}(l_1l_2\cdots l_p|k_1k_2\cdots k_p), \tag{48}$$

which are antisymmetric in each set of their indices, and which therefore vanish identically if two indices in a set happen to be the same. In the nonrelativistic quantum theory of antisymmetric particles with static interaction, there is consequently no self-energy problem, but, unfortunately, one has so far not been able to generalize this result to relativistic theories and time-dependent interactions.

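In this connection, we will also introduce the total quantities:

$$V_k^{\,l} = \int V^l(1)\,\rho_k(1',1)\,dx_1 = \int\Omega_{12}
\begin{vmatrix}
\rho_k(1',1) & \rho_k(1',2)\\
\rho_l(2',1) & \rho_l(2',2)
\end{vmatrix}dx_1dx_2,$$
$$V_k^{\,lm} = \int V^{lm}(1)\,\rho_k(1',1)\,dx_1 = \int\Omega_{123}
\begin{vmatrix}
\rho_k(1',1) & \rho_k(1',2) & \rho_k(1',3)\\
\rho_l(2',1) & \rho_l(2',2) & \rho_l(2',3)\\
\rho_m(3',1) & \rho_m(3',2) & \rho_m(3',3)
\end{vmatrix}dx_1dx_2dx_3; \tag{49}$$

where $V_k^{\,l}=V_l^{\,k}$ is the total potential between the spin-orbitals $k$ and $l$ due to the two-particle interaction $\Omega_{12}$, $V_k^{\,lm}=V_l^{\,km}=V_m^{\,kl}$ is the total potential between the spin-orbitals $k$, $l$, and $m$ due to the three-particle interaction $\Omega_{123}$, etc. Using (39), (44), and (49), we now obtain for the "spin-orbital eigenvalue" $\omega_k$:

$$\omega_k = \int\psi_k^*(1)\,\Omega_{\mathrm{eff}}(1)\,\psi_k(1)\,dx_1
= \int\Omega_1\,\rho_k(1',1)\,dx_1 + \sum_l V_k^{\,l} + (2!)^{-1}\sum_{lm}V_k^{\,lm} + \cdots, \tag{50}$$

giving a simple physical interpretation of this quantity. We may also express $\langle\Omega_{\mathrm{op}}\rangle_{\mathrm{Av}}$ in terms of the total spin-orbital interaction potentials (49), and, according to (21) and (43), we obtain

$$\langle\Omega_{\mathrm{op}}\rangle_{\mathrm{Av}} = \Omega_{(0)} + \sum_k\int\Omega_1\,\rho_k(1',1)\,dx_1
+ (2!)^{-1}{\sum_{kl}}' V_k^{\,l} + (3!)^{-1}{\sum_{klm}}' V_k^{\,lm} + \cdots, \tag{51}$$

where, due to the factorial coefficients, each interaction term $V_k^{\,l}$, $V_k^{\,lm}$, etc. will be counted only once. Comparing (50) and (51), we get also

$$\langle\Omega_{\mathrm{op}}\rangle_{\mathrm{Av}} = \Omega_{(0)} + \sum_k\omega_k - \tfrac{1}{2}{\sum_{kl}}' V_k^{\,l} - \tfrac{1}{3}{\sum_{klm}}' V_k^{\,lm} - \cdots, \tag{52}$$

showing that $\langle\Omega_{\mathrm{op}}\rangle_{\mathrm{Av}}$ is different from the sum of the eigenvalues, since the interaction potentials are counted in different ways in these two quantities.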
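The bookkeeping in (47) and (49) to (52) can be followed on a small model restricted to one- and two-particle operators. The sketch assumes real orthonormal spin-orbitals on a hypothetical grid, a symmetric matrix `h` for $\Omega_1$, a local symmetric interaction `w[x1, x2]` for $\Omega_{12}$, and $\Omega_{(0)}=0$; the orbitals are arbitrary and need not be self-consistent, since the relation (52) only uses the definition (50).

```python
import numpy as np

rng = np.random.default_rng(4)
M, N = 6, 3                                        # assumed grid size and particle number
h = rng.normal(size=(M, M)); h = h + h.T           # stand-in for Omega_1
w = rng.normal(size=(M, M)); w = w + w.T           # stand-in for a local Omega_12
psi, _ = np.linalg.qr(rng.normal(size=(M, N)))     # orthonormal spin-orbitals (columns)

rho = psi @ psi.T                                  # Eq. (35)
n = np.diag(rho)

# Total spin-orbital potentials V_k^l of Eq. (49): ordinary minus exchange part
dens = psi ** 2                                    # rho_k(x, x)
pair = np.einsum('xk,xl->klx', psi, psi)           # psi_k(x) psi_l(x)
V = dens.T @ w @ dens - np.einsum('klx,xy,kly->kl', pair, w, pair)

# Effective operator (41)-(42) on the grid, and eigenvalue expectation values (50)
omega_eff = h + np.diag(w @ n) - w * rho
omega = np.einsum('xk,xy,yk->k', psi, omega_eff, psi)

# Average value from Eq. (21) with Omega_(0) = 0
average = np.sum(h * rho) + 0.5 * (n @ w @ n - np.sum(w * rho * rho))
```

The diagonal of `V` vanishes, which is the absence of self-interaction expressed by (47), and `average` equals `omega.sum() - 0.5 * V.sum()` as stated in (52).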


(c) Average Exchange Potentials 


The exchange potentials in (42) are of a rather complicated character, and Slater¹² has therefore proposed that they should be approximately replaced by "average exchange potentials" being ordinary functions of $x_1$. In Part I, we have shown that this can be strictly done by minimizing the weighted "error sum" (I, 100), and, according to the general formula (I, 103), we then obtain in the Hartree-Fock approximation:

$$V(x_1) = \int\Omega_{12}
\begin{vmatrix}
\rho(1',1) & \rho(1',2)\\
\rho(2',1) & \rho(2',2)
\end{vmatrix}dx_2\Big/\rho(1,1)
+ (2!)^{-1}\int\Omega_{123}
\begin{vmatrix}
\rho(1',1) & \rho(1',2) & \rho(1',3)\\
\rho(2',1) & \rho(2',2) & \rho(2',3)\\
\rho(3',1) & \rho(3',2) & \rho(3',3)
\end{vmatrix}dx_2dx_3\Big/\rho(1,1) + \cdots. \tag{53}$$

For $\Omega_{ij}=e^2/r_{ij}$, $\Omega_{ijk}=0$, this is just the potential introduced and investigated in detail by Slater.¹³ However, an essential difference between (42) and (53) will later be pointed out in connection with the treatment of the "virtual" solutions to the eigenvalue problem (39).

¹² J. C. Slater, Phys. Rev. 81, 385 (1951).

¹³ See also J. C. Slater, Phys. Rev. 82, 538 (1951); G. W. Pratt, Jr., Phys. Rev. 88, 1217 (1952).
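The two-particle part of (53) can be written out for a discretized model. The sketch below assumes real orthonormal spin-orbitals on a hypothetical grid and a local symmetric interaction `w[x1, x2]` standing in for $\Omega_{12}$; it compares the averaged local potential with the operator (42) through their density-weighted sums, which is only one of several possible consistency checks.

```python
import numpy as np

rng = np.random.default_rng(8)
M, N = 6, 3                                        # assumed grid size and particle number
w = rng.normal(size=(M, M)); w = w + w.T           # stand-in for a local Omega_12
psi, _ = np.linalg.qr(rng.normal(size=(M, N)))     # orthonormal spin-orbitals
rho = psi @ psi.T                                  # Eq. (35)
n = np.diag(rho)                                   # rho(1,1) at each grid point

# Two-particle term of Eq. (53): the 2x2 determinant summed over x2, divided by rho(1,1)
v_average = (n * (w @ n) - np.sum(w * rho * rho, axis=1)) / n

# Two-particle term of Eq. (42) as a matrix: ordinary potential minus exchange kernel
v_op = np.diag(w @ n) - w * rho

weighted_average = np.sum(n * v_average)           # sum over x of rho(x,x) V(x)
operator_trace = np.trace(v_op @ rho)              # sum over k of <psi_k|V_op|psi_k>
```

The division by `n` assumes that the density does not vanish at any grid point, which holds for the random orbitals used here but would need care in a real application.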


(d) Variation of the Fundamental Invariant 





In considering the problem of how to determine the Slater determinant (4) in order to get as accurate an approximation as possible to an eigenfunction of the operator $\Omega_{\mathrm{op}}$, we have in the previous section applied the variation principle by varying the basic orthonormal set $\psi_k$. In connection with the orthonormality condition





(34), we have then introduced a matrix $\lambda(l|k)$ of Lagrangian multipliers having the eigenvalues $\omega_k$. However, according to (14), the total wave function (4) is dependent only on the fundamental invariant $\rho$ defined by (12), and we will now instead treat the same problem by varying this quantity as a whole. Since $\rho$ fulfills the relations (13), we get for its variation $\delta\rho=\delta\rho(x_1,x_2)$:

$$\delta\rho = \rho\cdot\delta\rho + \delta\rho\cdot\rho; \qquad \mathrm{Tr}(\delta\rho) = 0. \tag{54}$$

Varying expression (21), and using (40) and the first relation (54), we then obtain

$$\delta\langle\Omega_{\mathrm{op}}\rangle_{\mathrm{Av}} = \int\Omega_{\mathrm{eff}}(1)\,\delta\rho(1',1)\,dx_1
= \int\delta\rho(1,2)\,\Omega_{\mathrm{eff}}(1)\,\rho(2,1)\,dx_1dx_2 + \text{complex conj.} \tag{55}$$


The auxiliary conditions for the variation of $\rho$ are contained in (54), and the problem is now to express them in convenient form. Combining the first relation (13) with the first relation (54), we get

$$\rho\cdot\delta\rho\cdot\rho = 0. \tag{56}$$

In the terminology of Part I, Sec. 5, this means that $\delta\rho$ is without orthogonal projection within the subspace of the Hilbert space defined by the matrix $\rho$. If $\Lambda=\Lambda(x_2,x_1)$ is an arbitrary function and $\lambda=\lambda(x_2,x_1)$ its orthogonal projection within the same subspace, defined by the matrix relation

$$\lambda = \rho\Lambda\rho, \tag{57}$$

then the "scalar product" of $\delta\rho$ and $\lambda$ must be zero, and the direct proof is simple:

$$\mathrm{Tr}(\delta\rho\cdot\lambda) = \mathrm{Tr}(\delta\rho\cdot\rho\Lambda\rho) = \mathrm{Tr}(\rho\,\delta\rho\,\rho\cdot\Lambda) = 0. \tag{58}$$

This is the auxiliary condition desired, and it can be expressed in the same form as (55), if we assume that $\lambda$ is a Hermitean matrix and we further add the complex conjugate to (58):

$$\int\delta\rho(1,2)\,\lambda(2,1)\,dx_1dx_2 + \text{complex conj.} = 0. \tag{59}$$

Combining (55) and (59), we obtain

$$\Omega_{\mathrm{eff}}(1)\,\rho(x_2,x_1) = \lambda(x_2,x_1), \tag{60}$$

which is the Hartree-Fock equation for the fundamental invariant $\rho$. This equation is, of course, identical with (36), but we note that the function $\lambda(x_2,x_1)$ in the right-hand member is here expressed directly in terms of $\rho$ according to (57). If the basic set is chosen orthonormal, $\lambda(x_2,x_1)$ may then be expanded in the form (37), see (I, 82) and (I, 86), and we obtain the connection with the previous theory.

¹⁴ See also J. Frenkel, Wave Mechanics, Advanced General Theory (Clarendon Press, Oxford, 1934), p. 435.
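The auxiliary conditions (54), (56), and (58) can be made concrete with a first-order rotation of a projector. The sketch assumes a real projector `rho` on a hypothetical grid and generates an admissible variation from an antisymmetric matrix `A`, i.e. the first-order change of $e^{\epsilon A}\rho\,e^{-\epsilon A}$; this particular parametrization is an assumption for illustration and is not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(5)
M, N = 6, 2                                        # assumed grid size and particle number
c, _ = np.linalg.qr(rng.normal(size=(M, N)))
rho = c @ c.T                                      # rho^2 = rho, Tr(rho) = N

A = rng.normal(size=(M, M)); A = A - A.T           # antisymmetric generator
delta_rho = A @ rho - rho @ A                      # first-order variation of rho

Lam = rng.normal(size=(M, M)); Lam = Lam + Lam.T   # arbitrary symmetric Lambda
lam = rho @ Lam @ rho                              # its projection, Eq. (57)

cond_54 = np.allclose(delta_rho, rho @ delta_rho + delta_rho @ rho)
trace_54 = np.trace(delta_rho)
cond_56 = np.allclose(rho @ delta_rho @ rho, 0.0)
scalar_product_58 = np.trace(delta_rho @ lam)
```

In this picture $\delta\rho$ only connects the occupied subspace with its complement, which is why its projection inside the occupied subspace, and hence its scalar product with any $\lambda$ of the form (57), vanishes.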
Instead of solving a set of $N$ Hartree-Fock equations for functions $\psi_k(x_1)$ $(k=1, 2, \cdots N)$ of a single variable, we can therefore in principle treat the same problem by solving a single equation (60) for a function $\rho(x_1,x_2)$ of two variables, fulfilling the auxiliary conditions

$$\rho^2 = \rho, \qquad \mathrm{Tr}(\rho) = N. \tag{61}$$

From the practical point of view, we do not know any convenient direct numerical method for solving equations like (60) and (61), but, with the development of the modern electronic computers, the situation may be changed. For the moment, Eq. (60) is mainly of interest in principle, and it may serve as a starting point for theories based on approximate forms of the matrix $\rho(x_1,x_2)$, such as the statistical approximation.


3. SOLUTION OF THE HARTREE-FOCK EQUATIONS 
BY EXPANDING THE EIGENFUNCTIONS IN A 
FIXED COMPLETE SET 


(a) General Theory 


The essential problem in the Hartree-Fock method, where the total wave function is approximated by a single Slater determinant, is the solution of the one-particle equation

$$\Omega_{\mathrm{eff}}(1)\,\psi_k(x_1) = \omega_k\,\psi_k(x_1), \tag{62}$$

for the basic spin-orbitals $\psi_k$ $(k=1, 2, \cdots N)$, where $\Omega_{\mathrm{eff}}(1)$ is given by (40). The Hartree-Fock equations (62) form together a system of coupled nonlinear integro-differential equations connected with an eigenvalue problem, and, since they therefore have a rather complicated character, we will discuss the methods for solving them in greater detail.

Hartree²,³ has shown that, for a single atom or ion, Eq. (62) is separable in polar coordinates, and that, after elimination of the angular part, it remains to solve a system of nonlinear radial integro-differential equations. Hartree and his collaborators have solved this system numerically by a method of successive approximation: one starts from trial values of the spin-orbitals, calculates $\rho$ and the corresponding potentials (45), and introduces them in the effective $\Omega$ operator. For this fixed operator, one then determines the first $N$ eigenvalues and eigenfunctions by numerical integration, which then may be used for a new evaluation of $\rho$ and the potentials, etc. The process is carried on until it becomes self-consistent, i.e., until two successive approximations agree within the accuracy desired, and the procedure is therefore called the "self-consistent-field" method.
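The successive-approximation cycle described above can be sketched for a finite model. This is not Hartree's radial integration scheme: it assumes a hypothetical grid representation with a symmetric matrix `h` for $\Omega_1$, a weak local symmetric interaction `w[x1, x2]` for $\Omega_{12}$, matrix diagonalization in place of numerical integration, and a simple damping of $\rho$ between cycles (the `mixing` parameter) that is added here for robustness and is not part of the description in the text.

```python
import numpy as np

def effective_operator(h, w, rho):
    """Omega_eff = Omega_1 + V_op for a local two-particle interaction w(x1, x2)."""
    direct = np.diag(w @ np.diag(rho))      # ordinary potential
    exchange = w * rho                      # exchange kernel from the P_12 term
    return h + direct - exchange

def self_consistent_field(h, w, n_particles, max_iter=200, tol=1e-10, mixing=0.5):
    """Iterate rho -> Omega_eff -> lowest N eigenfunctions -> rho until stable."""
    m = h.shape[0]
    rho = np.zeros((m, m))                  # trial density: start from Omega_1 alone
    for iteration in range(max_iter):
        omega, psi = np.linalg.eigh(effective_operator(h, w, rho))
        occupied = psi[:, :n_particles]     # first N eigenfunctions
        rho_new = occupied @ occupied.T     # new evaluation of rho, Eq. (35)
        if np.linalg.norm(rho_new - rho) < tol:
            return rho_new, omega[:n_particles], iteration
        rho = (1.0 - mixing) * rho + mixing * rho_new   # damped update (assumption)
    raise RuntimeError("self-consistency not reached")

rng = np.random.default_rng(7)
M, N = 6, 2                                 # assumed grid size and particle number
h = np.diag(np.arange(M, dtype=float))      # well-separated one-particle levels
w = rng.normal(size=(M, M)); w = 0.05 * (w + w.T)   # weak interaction
rho, omega, n_iter = self_consistent_field(h, w, N)
```

At self-consistency the returned `rho` is a projector that commutes with the effective operator built from it, which is the matrix form of the statement that two successive approximations agree.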

In molecular and crystal problems, it is usually not possible to separate Eq. (62) by using any particular form of coordinates.¹⁵ In such cases, one may try another approach, namely to expand the eigenfunctions of (62) in terms of a fixed complete orthonormal set $\varphi_\mu$ $(\mu=1, 2, \cdots)$:

$$\psi_k(x_1) = \sum_{\mu=1}^{\infty}\varphi_\mu(x_1)\,c_{\mu k}. \tag{63}$$

¹⁵ The only exceptions would be systems of extremely high symmetry; compare the "cellular" method for crystals and its modifications.

Fig. 1. The form of the rectangular matrix $\mathbf{c}$, which has unitary properties at least in the vertical direction.


For the following general discussion, it is not necessary to specify the detailed character of our many-particle system, which may be an atom,¹⁶ molecule, crystal, or atomic nucleus. For the sake of simplicity, we may formally assume that the set has only finite order $M$, and afterwards we will then let $M\to\infty$.

Since the operator $\Omega_{\mathrm{eff}}$ is Hermitean, its eigenfunctions $\psi_k$ $(k=1, 2, \cdots N)$ are automatically orthogonal, and we will further assume that they are normalized to fulfill (34). We note that, in addition to the $N$ eigenfunctions which are used in constructing the density matrix $\rho$, there may be also higher solutions for $k=N+1, N+2, \cdots$, but, for the moment, we are not interested in these "virtual spin-orbitals." The main problem is now to determine the coefficients $c_{\mu k}$ in (63), which form a rectangular matrix of order $M\times N$ with $M>N$. It is here understood that (63) is momentarily replaced by the approximate expansion

$$\psi_k(x_1) \approx \sum_{\mu=1}^{M}\varphi_\mu(x_1)\,c_{\mu k}. \tag{64}$$


Substituting (64) into the orthonormality condition (34), we obtain

$$\sum_{\mu=1}^{M} c^\dagger_{k\mu}\,c_{\mu l} = \delta_{kl}, \tag{65}$$

which may be condensed in matrix form to $\mathbf{c}^\dagger\mathbf{c}=\mathbf{1}$, if we strictly observe that the symbols $\mathbf{c}$ and $\mathbf{c}^\dagger$ indicate rectangular matrices. We note that the matrix $\mathbf{c}$ has unitary properties at least in the vertical direction, but that we cannot prove the complementary relation $\mathbf{c}\mathbf{c}^\dagger=\mathbf{1}$ without completing the matrix to square form by investigating the character of the virtual orbitals. The properties of $\mathbf{c}$ are indicated in Fig. 1.

¹⁶ We note that, for atoms, the method may be used alternatively with Hartree's conventional treatment applying numerical integration.

In Sec. 1 we have shown that, in the Hartree-Fock approximation, the entire physical situation of the many-particle system may be described by the fundamental invariant $\rho$ defined by (12) or (35). Putting (64) into (35), we obtain

$$\rho(x_1,x_2) = \sum_{k=1}^{N}\psi_k^*(x_1)\,\psi_k(x_2) = \sum_{\mu\nu}\varphi_\mu^*(x_1)\,\varphi_\nu(x_2)\,Q(\nu\mu), \tag{66}$$

where we have introduced the symbol

$$Q(\nu\mu) = \sum_{k=1}^{N} c_{\nu k}\,c^\dagger_{k\mu}. \tag{67}$$

The entire physical situation is now instead determined by the square matrix $\mathbf{Q}$ of order $M\times M$.

Relation (66) gives also the first-order density matrix $\gamma$ of the system. In Part I, the matrix of the coefficients $\gamma(\nu|\mu)$ in the expansion of this density matrix with respect to a particular basic set $\varphi_\mu$ has generally been called the charge- and bond-order matrix with respect to this set. We will here use the special notation

$$\gamma(\nu|\mu) = Q(\nu\mu), \tag{68}$$

in order to indicate that we are considering the Hartree-Fock approximation. The physical interpretation of the elements is given in Part I, Sec. 2. We note also that, in forming (66), we are carrying out the reverse of the procedure used in forming (I, 74), since we are here going from natural spin-orbitals to an arbitrarily chosen basic set $\varphi_\mu$.

Since $\rho$ fulfills the relations (13), the same must be
true also for the matrix $Q$:

$$Q^2 = Q, \qquad \mathrm{Tr}(Q) = N, \tag{69}$$

and we can easily check these relations by using (67).
The Hermitean matrix $Q$ is therefore a “projection
operator” in the sense of Part I, Sec. 5. It has $N$ eigenvalues
equal to 1 and $(M-N)$ eigenvalues equal to
zero, and the eigenvectors associated with the eigenvalue
1 form together the rectangular matrix $c$. Hence
we have

$$Qc = c, \qquad c^{\dagger}Qc = 1, \tag{70}$$

where the unit matrix of the last relation is a square
matrix of order $N$. These relations are easily checked by
using (65) and (67).
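The relations (65), (69), and (70) can be checked numerically. The following is a minimal sketch, assuming NumPy and an arbitrary coefficient matrix with orthonormal columns; the sizes `M`, `N`, the random seed, and the helper name `projector_from_orbitals` are illustrative choices, not part of the text.

```python
import numpy as np


def projector_from_orbitals(c):
    """Return Q = c c^dagger for an M x N matrix c with orthonormal columns, cf. (67)."""
    return c @ c.conj().T


rng = np.random.default_rng(0)
M, N = 6, 3  # hypothetical basis size and particle number

# The reduced QR factorization of a random complex matrix yields an
# M x N matrix with orthonormal columns, i.e. c^dagger c = 1 as in (65).
a = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
c, _ = np.linalg.qr(a)

Q = projector_from_orbitals(c)
eigvals = np.linalg.eigvalsh(Q)  # ascending: expected (M - N) zeros, then N ones

print("Q^2 = Q      :", np.allclose(Q @ Q, Q))
print("Tr(Q)        :", np.trace(Q).real)
print("Q c = c      :", np.allclose(Q @ c, c))
print("eigenvalues  :", np.round(eigvals, 12))
```

Note that $cc^{\dagger} = Q$ is not the unit matrix unless $M = N$, which illustrates why the complementary relation to (65) cannot be proved for the rectangular matrix alone.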

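Let us now consider the density matrices in the
$\mu$ space. Putting (66) into (15) and using the law for
forming the determinant of a product of matrices, we
obtain

$$\gamma(x_1'|x_1) = \sum_{\mu\nu} \varphi_\mu^*(x_1')\,\varphi_\nu(x_1)\,Q(\nu\mu),$$

$$\begin{aligned}
\Gamma^{(p)}(x_1' x_2' \cdots x_p' \,|\, x_1 x_2 \cdots x_p)
&= (p!)^{-3} \sum_{\mu_1 \cdots \mu_p} \sum_{\nu_1 \cdots \nu_p}
\det\{\varphi_{\mu_j}^*(x_i')\}\, \det\{\varphi_{\nu_j}(x_i)\}\, \det\{Q(\nu_i\mu_j)\} \\
&= (p!)^{-1} \sum_{\mu_1 \cdots \mu_p} \sum_{\nu_1 \cdots \nu_p}
\varphi_{\mu_1}^*(x_1') \cdots \varphi_{\mu_p}^*(x_p')\,
\varphi_{\nu_1}(x_1) \cdots \varphi_{\nu_p}(x_p)\, \det\{Q(\nu_i\mu_j)\},
\end{aligned} \tag{71}$$

where $\det\{a_{ij}\}$ denotes the determinant of order $p$ with the element $a_{ij}$ in row $i$ and column $j$. This gives

$$\Gamma^{(p)}(\nu_1\nu_2 \cdots \nu_p \,|\, \mu_1\mu_2 \cdots \mu_p) = (p!)^{-1}
\begin{vmatrix}
Q(\nu_1\mu_1) & Q(\nu_1\mu_2) & \cdots & Q(\nu_1\mu_p) \\
\vdots & \vdots & & \vdots \\
Q(\nu_p\mu_1) & Q(\nu_p\mu_2) & \cdots & Q(\nu_p\mu_p)
\end{vmatrix}. \tag{72}$$

All charge- and bond-order matrices of higher orders may therefore be expressed in the fundamental matrix $Q$,
and this is true also for $p = N$, i.e., for the wave function itself. Introducing the matrix elements with respect to
the set $\varphi_\mu$:

$$\begin{aligned}
[\mu|\Omega_1|\nu] &= \int \varphi_\mu^*(1)\,\Omega_1\,\varphi_\nu(1)\,dx_1, \\
[\mu_1\mu_2|\Omega_{12}|\nu_1\nu_2] &= \int \varphi_{\mu_1}^*(1)\,\varphi_{\mu_2}^*(2)\,\Omega_{12}\,\varphi_{\nu_1}(1)\,\varphi_{\nu_2}(2)\,dx_1 dx_2, \quad \ldots,
\end{aligned} \tag{73}$$

and applying (I, 10), we get the fundamental formula

$$\begin{aligned}
\langle\Omega_{\mathrm{op}}\rangle_{\mathrm{Av}} ={}& \Omega_{(0)} + \sum_{\mu\nu} [\mu|\Omega_1|\nu]\,Q(\nu\mu)
+ (2!)^{-1} \sum_{\substack{\mu_1\mu_2 \\ \nu_1\nu_2}} [\mu_1\mu_2|\Omega_{12}|\nu_1\nu_2]
\begin{vmatrix} Q(\nu_1\mu_1) & Q(\nu_1\mu_2) \\ Q(\nu_2\mu_1) & Q(\nu_2\mu_2) \end{vmatrix} \\
&+ (3!)^{-1} \sum_{\substack{\mu_1\mu_2\mu_3 \\ \nu_1\nu_2\nu_3}} [\mu_1\mu_2\mu_3|\Omega_{123}|\nu_1\nu_2\nu_3]
\begin{vmatrix} Q(\nu_1\mu_1) & \cdots & Q(\nu_1\mu_3) \\ \vdots & & \vdots \\ Q(\nu_3\mu_1) & \cdots & Q(\nu_3\mu_3) \end{vmatrix} + \cdots,
\end{aligned} \tag{74}$$

showing that the average value of a physical quantity $\Omega_{\mathrm{op}}$ may be expressed in terms of the matrix elements (73)
and the charge- and bond-order matrix $Q$.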

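The essential problem is now to solve the Hartree-Fock equation (62) and determine the matrix $Q$. Since (62)
for a fixed operator $\Omega_{\mathrm{eff}}$ represents a linear eigenvalue problem in the ordinary $x$-space, we may again apply the
variation principle (I, 22). Forming the average value of $\Omega_{\mathrm{eff}}$ by using expansion (64) and omitting the index $k$,
we obtain

$$\omega = \frac{\int \psi^*(1)\,\Omega_{\mathrm{eff}}(1)\,\psi(1)\,dx_1}{\int \psi^*(1)\,\psi(1)\,dx_1}
= \frac{\sum_{\mu\nu} c_\mu^*\,[\mu|\Omega_{\mathrm{eff}}|\nu]\,c_\nu}{\sum_{\mu} c_\mu^*\, c_\mu}, \tag{75}$$

where

$$\begin{aligned}
[\mu|\Omega_{\mathrm{eff}}|\nu] ={}& \int \varphi_\mu^*(1)\,\Omega_{\mathrm{eff}}(1)\,\varphi_\nu(1)\,dx_1 \\
={}& [\mu|\Omega_1|\nu] + \int \varphi_\mu^*(1)\,\Omega_{12}
\begin{vmatrix} \varphi_\nu(1) & \varphi_\nu(2) \\ \rho(2',1) & \rho(2',2) \end{vmatrix} dx_1 dx_2 \\
&+ (2!)^{-1} \int \varphi_\mu^*(1)\,\Omega_{123}
\begin{vmatrix} \varphi_\nu(1) & \varphi_\nu(2) & \varphi_\nu(3) \\ \rho(2',1) & \rho(2',2) & \rho(2',3) \\ \rho(3',1) & \rho(3',2) & \rho(3',3) \end{vmatrix} dx_1 dx_2 dx_3 + \cdots \\
={}& [\mu|\Omega_1|\nu] + \sum_{\mu_2\nu_2} Q(\nu_2\mu_2)\,(1 - P_{\nu\nu_2})\,[\mu\mu_2|\Omega_{12}|\nu\nu_2] \\
&+ (2!)^{-1} \sum_{\substack{\mu_2\mu_3 \\ \nu_2\nu_3}}
\begin{vmatrix} Q(\nu_2\mu_2) & Q(\nu_2\mu_3) \\ Q(\nu_3\mu_2) & Q(\nu_3\mu_3) \end{vmatrix}
(1 - P_{\nu\nu_2} - P_{\nu\nu_3})\,[\mu\mu_2\mu_3|\Omega_{123}|\nu\nu_2\nu_3] + \cdots.
\end{aligned} \tag{76}$$

As in (44)-(52), the operator $\Omega_{\mathrm{eff}}$ may here be expanded
in terms of potentials from the densities $\varphi_\mu^*\varphi_\nu Q(\nu\mu)$,
and the matrix elements (76) may then be interpreted
analogously.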

Keeping $\Omega_{\mathrm{eff}}$ fixed, varying the coefficients $c$ in (75),
and putting $\delta\omega = 0$, we obtain a system of linear equations

$$\sum_{\nu} \{[\mu|\Omega_{\mathrm{eff}}|\nu] - \omega\,\delta_{\mu\nu}\}\, c_\nu = 0, \tag{77}$$

which is soluble only if

$$\det\{[\mu|\Omega_{\mathrm{eff}}|\nu] - \omega\,\delta_{\mu\nu}\} = 0. \tag{78}$$

This secular equation is an algebraic equation in $\omega$ of
the order $M$, and it has therefore $M$ eigenvalues $\omega_k$
($k = 1, 2, \ldots M$) which are all real. After the eigenvalues
have been determined, we may solve the $M$ systems
(77), giving a quadratic matrix $c_{\mu k}$ of order $M$. In addition
to the $N$ ordinary solutions, in which we were
originally interested, we have therefore also obtained
$(M-N)$ “virtual” solutions, which will be discussed
later.

The whole nonlinear problem may now be solved by 
a process of successive approximations which is anal- 
ogous to Hartree’s “self-consistent-field” method for 
atoms. One starts from trial values $Q^{(0)}$ of the fundamental
charge- and bond-order matrix, evaluates the
matrix elements (76), solves the secular equation (78), 
and thereafter the linear system (77). From the rec- 
tangular matrix c, corresponding to the N ordinary 
solutions of (78), one may then form a new approxima- 
tion of the matrix Q by using the definition (67), and 
the procedure is then repeated until it becomes “self- 
consistent,” i.e., until two successive approximations 
agree within the accuracy desired. 
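The cycle just described can be sketched in a few lines. The sketch below assumes NumPy and replaces the matrix elements (76) by a deliberately simple toy form, $h + u\,\mathrm{diag}(Q)$, which is an assumption made only so that the loop is runnable; the names `effective_matrix`, `scf`, `h`, and `u`, the zero trial matrix, and the choice of the $N$ lowest eigenvalues as the ordinary solutions are likewise illustrative.

```python
import numpy as np


def effective_matrix(h, u, Q):
    """Toy stand-in for the elements (76): one-particle part plus a diagonal mean field.

    This functional form is an assumption for illustration only; it is
    not the effective operator defined in the text.
    """
    return h + u * np.diag(np.diag(Q).real)


def scf(h, u, n_particles, tol=1e-10, max_iter=200):
    """Repeat: build the matrix, solve (78) and (77), rebuild Q from (67)."""
    M = h.shape[0]
    Q = np.zeros((M, M))                      # trial matrix Q^(0) (assumed: zero)
    converged = False
    for _ in range(max_iter):
        omega, c_all = np.linalg.eigh(effective_matrix(h, u, Q))
        c = c_all[:, :n_particles]            # N ordinary solutions (assumed: lowest)
        Q_new = c @ c.conj().T                # definition (67)
        converged = np.linalg.norm(Q_new - Q) < tol
        Q = Q_new
        if converged:                         # two successive approximations agree
            break
    return Q, omega, converged


rng = np.random.default_rng(1)
a = rng.standard_normal((5, 5))
h = (a + a.T) / 2                             # hypothetical Hermitean one-particle matrix
Q, omega, converged = scf(h, u=0.2, n_particles=2)
print("self-consistent:", converged, " Tr(Q) =", np.trace(Q))
```

Whatever the convergence behavior, every iterate built from (67) satisfies the projector relations (69); only the self-consistency between $Q$ and the matrix elements has to be reached by iteration.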

The method of expanding the Hartree-Fock functions
in terms of a fixed given set was first used in the
molecular orbital theory in investigating the electronic
structure of molecules and crystals. However, a
systematic treatment of the variation problem, emphasizing
its nonlinear character, was first given by
Roothaan, who varied the coefficients $c_{\mu k}$ directly in
the total quantity $\langle\Omega_{\mathrm{op}}\rangle_{\mathrm{Av}}$. The derivation given here
follows Hartree’s conventional scheme more closely and, in
addition, we have pointed out the essential simplifications
of the calculations which may be obtained by
introducing the charge- and bond-order matrix $Q$.


(b) Direct Variation of the Matrix Q 


One could also consider the matrix $Q$ as the fundamental
variable in the problem and vary this quantity
as a whole instead of the coefficients $c_{\mu k}$. By varying
expression (74) and using (76), we obtain

$$\delta\langle\Omega_{\mathrm{op}}\rangle_{\mathrm{Av}} = \sum_{\mu\nu} [\mu|\Omega_{\mathrm{eff}}|\nu]\,\delta Q(\nu\mu) = \mathrm{Tr}(\Omega_{\mathrm{eff}}\,\delta Q). \tag{79}$$

The auxiliary conditions may be treated analogously to
Sec. 2 (d). Since $Q$ fulfills the relations (69), we get for
its variation:

$$\delta Q = Q\,\delta Q + \delta Q\,Q, \qquad \mathrm{Tr}(\delta Q) = 0, \tag{80}$$

and further

$$Q\,\delta Q\,Q = 0, \tag{81}$$

showing that $\delta Q$ is without orthogonal projection within
the subspace of the general Hilbert space defined by the
projection operator $Q$. If $A$ is an arbitrary matrix of
order $M$ and $\bar{A}$ its orthogonal projection within this
subspace defined by

$$\bar{A} = QAQ, \qquad
\bar{A}_{\mu\nu} = \sum_{k,l=1}^{N} c_{\mu k}\, a_{kl}\, c^{\dagger}_{l\nu}, \qquad
a_{kl} = \sum_{\mu,\nu=1}^{M} c^{\dagger}_{k\mu}\, A_{\mu\nu}\, c_{\nu l}, \tag{82}$$

then the “scalar product” of $\delta Q$ and $\bar{A}$ is vanishing:

$$\mathrm{Tr}(\delta Q \cdot \bar{A}) = 0. \tag{83}$$

This relation presents the auxiliary conditions in a
convenient form, and, by combining (79) and (83), we
obtain

$$[\mu|\Omega_{\mathrm{eff}}|\nu] = \bar{A}_{\mu\nu}. \tag{84}$$

This is the Hartree-Fock condition for the charge- and
bond-order matrix $Q$, and it says simply that the matrix
$[\mu|\Omega_{\mathrm{eff}}|\nu]$ should belong to the subspace defined by $Q$.
Together with (69), this condition is sufficient for determining
$Q$. It may also be expressed in the form

$$\Omega_{\mathrm{eff}} = Q\,\Omega_{\mathrm{eff}}\,Q, \tag{85}$$

where $\Omega_{\mathrm{eff}}$, as in (79), is the matrix formed by the elements
$[\mu|\Omega_{\mathrm{eff}}|\nu]$.
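The auxiliary conditions (80), (81), and (83) can be illustrated numerically. The sketch below, assuming NumPy and real matrices, uses the first-order change of a projector under an infinitesimal orthogonal rotation, $\delta Q = XQ - QX$ with an antisymmetric generator $X$; this particular parametrization of the variation, and all names and sizes, are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 5, 2  # hypothetical basis size and particle number

# A projector Q = c c^T built from orthonormal columns, cf. (67) and (69).
c, _ = np.linalg.qr(rng.standard_normal((M, N)))
Q = c @ c.T

# Antisymmetric generator X of an infinitesimal rotation of the orbitals;
# to first order, exp(X) Q exp(-X) = Q + (X Q - Q X).
g = rng.standard_normal((M, M))
X = g - g.T
dQ = X @ Q - Q @ X

# Orthogonal projection of an arbitrary matrix A into the subspace of Q, cf. (82).
A = rng.standard_normal((M, M))
A_bar = Q @ A @ Q

print("(80) dQ = Q dQ + dQ Q :", np.allclose(dQ, Q @ dQ + dQ @ Q))
print("(80) Tr(dQ) = 0       :", np.isclose(np.trace(dQ), 0.0))
print("(81) Q dQ Q = 0       :", np.allclose(Q @ dQ @ Q, 0.0))
print("(83) Tr(dQ A_bar) = 0 :", np.isclose(np.trace(dQ @ A_bar), 0.0))
```

Relation (83) follows from (81) by cyclic invariance of the trace, since $\mathrm{Tr}(\delta Q\, QAQ) = \mathrm{Tr}(Q\,\delta Q\,Q\,A)$.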


(c) Applications to the MO-LCAO-Method in the 
Theory of the Electronic Structure of 
Molecules and Crystals 


In the molecular-orbital theory of molecules and 
crystals introduced by Lennard-Jones, Hund, Mulliken, 
Bloch, and others, the molecular orbitals for the elec- 
trons were assumed to be solutions to a one-particle 
Schrödinger equation, where the “effective” Hamiltonian
consisted of the kinetic energy of the electron,
its potential energy in the nuclear framework, and its 
potential energy in the field of all the other electrons. 
As in the original Hartree scheme, the exclusion of the 
interaction between a particle and itself caused mathe- 
matical as well as physical difficulties, until this problem 
was successfully solved by the introduction of the 
antisymmetry requirement in the Hartree-Fock scheme, 
which automatically eliminated the self-interaction; 
see (47). The effective Hamiltonian of a molecule or a 
crystal may therefore now be properly represented by 
the one-particle operator (40) corresponding to a total 
Hamiltonian of the form (I, 11):

but we note that we have here neglected relativistic
effects (including all spin couplings) as well as the zero-point
vibrations of the nuclei $g$, having the atomic
numbers $Z_g$.


(d) Fundamental Charge and Bond Order Matrix R 
with Respect to the Ordinary Atomic Orbitals 


In the MSO-LCASO method of treating the electronic
structure of molecules and crystals, the molecular spin-orbitals
(MSO) are assumed to be formed by linear
combinations of atomic spin-orbitals (ASO) $\phi_\mu$, which
may be the ordinary or hybridized 1s, 2s, 2p, 3s, ...
spin-orbitals associated with the atomic constituents of
the system. In comparison to (a), this assumption leads
to a complication, since atomic spin-orbitals $\phi_\mu$ and $\phi_\nu$
belonging to neighboring atoms are usually overlapping,
and they have therefore nonorthogonality or overlap
integrals

$$\int \phi_\mu^*(1)\,\phi_\nu(1)\,dx_1 = \Delta_{\mu\nu} = \delta_{\mu\nu} + S_{\mu\nu}, \tag{87}$$

which may be essentially different from zero. However,
by a suitable linear transformation such as (33):

$$\varphi_\mu = \sum_{\alpha} \phi_\alpha\,(\Delta^{-1/2})_{\alpha\mu} = \sum_{\alpha} \phi_\alpha\,[(1+S)^{-1/2}]_{\alpha\mu}, \tag{88}$$

the basic set may be orthonormalized, $\int \varphi_\mu^* \varphi_\nu\,dx = \delta_{\mu\nu}$,
and the general theory developed in (a) may then be
directly applied to the set $\varphi_\mu$ of orthonormalized atomic
spin-orbitals (ON-ASO).

Let us consider a molecule or crystal in the Hartree-Fock
approximation having a total wave function approximated
by a single Slater determinant, built up
from $N$ molecular spin-orbitals $\psi_1, \psi_2, \ldots \psi_N$. Expanding
$\psi_k$ in the form (64) and using (88), we now obtain for
the fundamental invariant $\rho$ defined by (35):

$$\begin{aligned}
\rho(x_1,x_2) &= \sum_{k} \psi_k^*(x_1)\,\psi_k(x_2) \\
&= \sum_{\mu\nu} \varphi_\mu^*(x_1)\,\varphi_\nu(x_2)\,Q(\nu\mu) \\
&= \sum_{\alpha\beta} \phi_\alpha^*(x_1)\,\phi_\beta(x_2)\,R(\beta\alpha),
\end{aligned} \tag{89}$$

where

$$R = \Delta^{-1/2}\,Q\,\Delta^{-1/2} = (1+S)^{-1/2}\,Q\,(1+S)^{-1/2}. \tag{90}$$

We note that the matrix $R$ may also be introduced
directly without the help of the ON-ASO, if we instead
start from the expansion

$$\psi_k = \sum_{\alpha} \phi_\alpha\, r_{\alpha k}, \tag{91}$$

and introduce $R$ by the definition

$$R(\beta\alpha) = \sum_{k=1}^{N} r_{\beta k}\, r^{\dagger}_{k\alpha}. \tag{92}$$


Since the set $\psi_k$ is orthonormalized according to (34),
we get further

$$r^{\dagger}\Delta r = 1, \tag{93}$$

for the rectangular matrix $r$ of order $M \times N$. By using
(92) and (93), it is easily shown that, instead of (69),
the matrix $R$ fulfills the relations

$$R\Delta R = R, \qquad \mathrm{Tr}(\Delta R) = N, \tag{94}$$

corresponding to (13). We note that the “overlapping”
matrix $\Delta$ occurs in (93) and (94), since it describes the
“geometry” of the nonorthogonal space, defined by
the basic set $\phi_\mu$, and $R$ may then be interpreted as a
“projection operator” in this space.
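The orthonormalization (88) and the relations (90), (93), and (94) can be verified on a small example. The sketch below assumes NumPy, real matrices, and a hypothetical overlap matrix with $S_{\mu\nu} = 0.2$ between neighboring orbitals only; the helper name `inv_sqrt` and all numerical values are illustrative.

```python
import numpy as np


def inv_sqrt(delta):
    """Return Delta^(-1/2) for a Hermitean positive-definite matrix via its eigenvalues."""
    w, v = np.linalg.eigh(delta)
    return (v / np.sqrt(w)) @ v.conj().T      # v diag(w^(-1/2)) v^dagger


M, N = 4, 2                                    # hypothetical sizes
S = 0.2 * (np.eye(M, k=1) + np.eye(M, k=-1))   # assumed nearest-neighbor overlaps
delta = np.eye(M) + S                          # overlap matrix, cf. (87)

rng = np.random.default_rng(3)
c, _ = np.linalg.qr(rng.standard_normal((M, N)))  # coefficients in the ON set, c^T c = 1
Q = c @ c.T

d = inv_sqrt(delta)
r = d @ c                                      # coefficients in the ordinary set
R = d @ Q @ d                                  # relation (90)

print("Delta^(-1/2) orthonormalizes:", np.allclose(d @ delta @ d, np.eye(M)))
print("(92) R = r r^T              :", np.allclose(R, r @ r.T))
print("(93) r^T Delta r = 1        :", np.allclose(r.T @ delta @ r, np.eye(N)))
print("(94) R Delta R = R          :", np.allclose(R @ delta @ R, R))
print("(94) Tr(Delta R)            :", np.trace(delta @ R))
```

The symmetric choice $\Delta^{-1/2}$ is the one written in (88); any other transformation that orthonormalizes the set would give the same $R$ from a correspondingly transformed $Q$.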

In a previous paper,¹⁷ we have called Q the charge-
and bond-order matrix of the system and R the “bond- 
ing-overlapping” matrix, but, according to the general 
terminology introduced in Part I, Sec. 2, we are now 
going to change our nomenclature and call Q and R 
the charge- and bond-order matrices with respect to the 
orthonormalized and ordinary atomic spin-orbitals, 
respectively. Let us now discuss the relation between 
Q and R in greater detail, particularly from the chemical 
point of view. 

As an example, we will consider the interaction
between “closed-shell” ions. This case is nondegenerate,
and, if we choose $M = N$, it follows from the complementary
relation to (65) that

$$Q(\nu\mu) = \delta_{\nu\mu}. \tag{95}$$

Since the bond orders vanish for $\mu \neq \nu$, this relation corresponds
to the nonexistence of valence bonds between
closed-shell systems. However, if the ions are put
together in, e.g., an ionic crystal, the circumstances will
be changed. In previous papers,¹⁸ we have shown that
the existence of such a crystal depends on the equilibrium
between the electrostatic attraction between
the ions and the repulsion due to the overlapping
between the ions at closer distances. It should be possible
to express these repulsive properties also in the
valency language, and we note that, in this special case,
the matrix $R$ is defined by

$$R = \Delta^{-1} = (1+S)^{-1},$$

giving charge and bond orders essentially depending on
the overlapping between the ions. Numerical applications
may be found in reference 11 of Part I, and we
will here only illustrate the result by some recent data
on LiH by Lundqvist,¹⁹ who has found the first elements
of the charge- and bond-order matrix associated with the
1s-orbitals of Li$^+$ and H$^-$ to be:

$R_{\mu\mu} = 1.419$ for H$^-$, $R_{\mu\mu} = 1.021$ for Li$^+$,
$R_{\mu\nu} = -0.052$ for Li$^+$H$^-$ (nearest neighbors),
$R_{\mu\nu} = -0.165$ for H$^-$H$^-$ (next nearest neighbors).

We note the negative signs and repulsive character of
the bond orders for nearest and next nearest neighbors.
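The sign pattern can be made plausible with the smallest possible case. The sketch below, assuming NumPy, evaluates $R = (1+S)^{-1}$ for two closed-shell orbitals with a single overlap integral $s$; the value $s = 0.2$ is a hypothetical number chosen for illustration and is not taken from the LiH data.

```python
import numpy as np


def closed_shell_R(S):
    """Return R = (1 + S)^(-1) for the closed-shell case M = N."""
    return np.linalg.inv(np.eye(S.shape[0]) + S)


s = 0.2                                  # hypothetical overlap integral between the two ions
S = np.array([[0.0, s],
              [s, 0.0]])
R = closed_shell_R(S)

# Closed form for two orbitals: R = (1 - s^2)^(-1) [[1, -s], [-s, 1]].
R_exact = np.array([[1.0, -s], [-s, 1.0]]) / (1.0 - s**2)

print(R)
print("matches closed form:", np.allclose(R, R_exact))
```

For any $0 < s < 1$ the diagonal elements exceed 1 and the off-diagonal element is negative, which is the same qualitative pattern as in the quoted values.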


17 P. O. Löwdin, J. Chem. Phys. 19, 1570 (1951).

18 See Part I, reference 11.

19 S. O. Lundqvist, Arkiv Fysik 8, 177 (1954). Compare also
I. Waller and S. O. Lundqvist, Arkiv Fysik 7, 121 (1953), where,
as Lundqvist has kindly pointed out to me, there are some misprints
in the signs of the bond orders.

Lundqvist has used these and higher elements for
investigating the diagonal element $\rho(x,x)$ of the density
matrix, and, in this connection, he has been able to
draw interesting conclusions concerning the amount of
“covalent” character of LiH. It should here further be
remarked that, in investigating the properties of ionic
crystals, the complete density-matrix formalism has
previously been used also by Fröman²⁰ and by Montet,
Keller, and Mayer.²¹

Another example may be found in a paper by Sponer²²
and the author, where the matrices $Q$ and $R$ for the
am bond in ethylene have been compared. 

Our comparison between Q and R shows that, from 
the chemical point of view, Q is the charge- and bond- 
order matrix for separated atoms with nondiagonal 
elements representing the formal valency of the atoms, 
whereas R is the charge- and bond-order matrix asso- 
ciated with the atoms in the molecule or crystal under 
consideration with nondiagonal elements representing 
the actual bond strengths. This result seems to be in 
agreement with the new definition of “relative bond 
strengths” introduced by Mulliken.²³

In discussing molecules and crystals, we have previously
also introduced so-called “combined atomic
spin-orbitals” by the relation

$$\phi_\mu^0 = \sum_{\alpha} \phi_\alpha\, R(\alpha\mu). \tag{96}$$

Using the fact that $R$ is a “projection operator” fulfilling
(94), we now obtain for the fundamental invariant

$$\begin{aligned}
\rho(x_1,x_2) &= \sum_{\alpha\beta} \phi_\alpha^*(x_1)\,\phi_\beta(x_2)\,R(\beta\alpha) \\
&= \sum_{\mu\nu} \phi_\mu^{0*}(x_1)\,\phi_\nu^{0}(x_2)\,\Delta_{\nu\mu}.
\end{aligned} \tag{97}$$

Since this quantity depends only on the set $\phi_\mu^0$ and the
overlapping matrix $\Delta$, all our previous asymmetric
formulas in reference 15 are now easily symmetrized.
Using (94), we obtain further

$$\int \phi_\mu^{0*}(1)\,\phi_\nu^{0}(1)\,dx_1 = R(\mu\nu). \tag{98}$$

This is therefore a case where the overlap integral
between two one-particle functions really measures the
charge and bond orders of the system.

From the fundamental theorem derived in Sec. 1, it
is clear that if, in any way, we could determine or
measure the first-order density matrix $\gamma(x_1'|x_1)$ or the
charge- and bond-order matrix $R$, then we would have
all information needed for describing the properties of
the system with an accuracy corresponding at least to
the Hartree-Fock approximation. It is possible, e.g., by
diffraction experiments to determine the average particle
distribution in the ordinary $x$-space, corresponding
to the diagonal element $\gamma(x_1|x_1)$, but, so far, no one has
been able to devise any experiments for measuring the
entire matrix $\gamma(x_1'|x_1)$. This quantity offers, in fact, a
rather intricate problem, since it gives the description
of the same physical situation in two complementary
spaces, and, like the complex wave function,²⁴ it has
therefore only a symbolic character and can never be
measured directly. How $\gamma(x_1'|x_1)$ may be determined
from two diagonal distributions $\gamma(x|x)$ and $\gamma(p|p)$ in
complementary spaces is a particular problem.²⁴

We will finally give the expression of the average
value of a physical quantity $\Omega_{\mathrm{op}}$ defined by (I, 2) in
terms of the fundamental matrix $R$. The charge- and
bond-order matrices of higher orders are expressed by
formula (72) with $Q$ replaced by $R$, and, according to
(I, 10), we then obtain

$$\begin{aligned}
\langle\Omega_{\mathrm{op}}\rangle_{\mathrm{Av}} ={}& \Omega_{(0)} + \sum_{\mu\nu} (\mu|\Omega_1|\nu)\,R(\nu\mu)
+ (2!)^{-1} \sum_{\substack{\mu_1\mu_2 \\ \nu_1\nu_2}} (\mu_1\mu_2|\Omega_{12}|\nu_1\nu_2)
\begin{vmatrix} R(\nu_1\mu_1) & R(\nu_1\mu_2) \\ R(\nu_2\mu_1) & R(\nu_2\mu_2) \end{vmatrix} \\
&+ (3!)^{-1} \sum_{\substack{\mu_1\mu_2\mu_3 \\ \nu_1\nu_2\nu_3}} (\mu_1\mu_2\mu_3|\Omega_{123}|\nu_1\nu_2\nu_3)
\begin{vmatrix} R(\nu_1\mu_1) & \cdots & R(\nu_1\mu_3) \\ \vdots & & \vdots \\ R(\nu_3\mu_1) & \cdots & R(\nu_3\mu_3) \end{vmatrix} + \cdots,
\end{aligned} \tag{99}$$

where the matrix elements are given by (73) with the
functions $\varphi_\mu$ replaced by the functions $\phi_\mu$. A special
case of this formula was previously derived by the
author in another way.


(e) Hartree-Fock Equations in the
MO-LCAO-Theory


Let us turn now to the question of determining the
molecular spin-orbitals $\psi_k$, i.e., the coefficients $r_{\alpha k}$ in
expansion (91). They are solutions to the eigenvalue
problem (62), and, for a fixed operator $\Omega_{\mathrm{eff}}$, they may
therefore be found from the variation principle. Using
(91), we obtain

$$\omega = \frac{\int \psi^*(1)\,\Omega_{\mathrm{eff}}(1)\,\psi(1)\,dx_1}{\int \psi^*(1)\,\psi(1)\,dx_1}
= \frac{\sum_{\mu\nu} r_\mu^*\,(\mu|\Omega_{\mathrm{eff}}|\nu)\,r_\nu}{\sum_{\mu\nu} r_\mu^*\,\Delta_{\mu\nu}\,r_\nu}, \tag{100}$$


20 P. O. Fröman, Arkiv Fysik 5, 135 (1952); 9, 93 (1954).

21 Montet, Keller, and Mayer, J. Chem. Phys. 20, 1057 (1952).

22 H. Sponer and P. O. Löwdin, J. phys. radium 15, 607 (1954).

23 R. S. Mulliken (private communication).


24 See, for instance, W. Pauli, Handbuch der Physik (Julius
Springer, Berlin, 1933), Vol. 24, No. 1, particularly p. 98.

where

$$\begin{aligned}
(\mu|\Omega_{\mathrm{eff}}|\nu) ={}& (\mu|\Omega_1|\nu) + \sum_{\mu_2\nu_2} R(\nu_2\mu_2)\,(1 - P_{\nu\nu_2})\,(\mu\mu_2|\Omega_{12}|\nu\nu_2) \\
&+ (2!)^{-1} \sum_{\substack{\mu_2\mu_3 \\ \nu_2\nu_3}}
\begin{vmatrix} R(\nu_2\mu_2) & R(\nu_2\mu_3) \\ R(\nu_3\mu_2) & R(\nu_3\mu_3) \end{vmatrix}
(1 - P_{\nu\nu_2} - P_{\nu\nu_3})\,(\mu\mu_2\mu_3|\Omega_{123}|\nu\nu_2\nu_3) + \cdots.
\end{aligned} \tag{101}$$

All matrix elements are here to be taken with respect
to the functions $\phi_\mu$. Keeping $\Omega_{\mathrm{eff}}$ fixed, varying the coefficients
$r_\mu$ in (100), and putting $\delta\omega = 0$, we obtain a
system of linear equations

$$\sum_{\nu} \{(\mu|\Omega_{\mathrm{eff}}|\nu) - \omega\,\Delta_{\mu\nu}\}\, r_\nu = 0, \tag{102}$$

which is soluble only if

$$\det\{(\mu|\Omega_{\mathrm{eff}}|\nu) - \omega\,\Delta_{\mu\nu}\} = 0. \tag{103}$$

These relations are identical with (77) and (78), if the
Kronecker symbol $\delta_{\mu\nu}$ in them is replaced by the overlapping
matrix $\Delta_{\mu\nu}$. The nonlinear problem of finding
the coefficients $r_{\mu k}$ may therefore again be solved by a
method of “self-consistent fields,” starting from trial
values $R^{(0)}$ of the matrix $R$, solving the secular equation
(103), evaluating the coefficients $r_{\mu k}$ from (102), and
recomputing a new approximation of $R$ according to
(92), etc.

If the numerical program for solving the secular
equation (78) may be adapted to the case $\Delta_{\mu\nu} \neq 0$ for
$\mu \neq \nu$, the occurrence of the overlapping matrix does not
cause any particular difficulties. The rectangular matrix
$r_{\mu k}$ has to fulfill the relation (93), but we note that, for
$k \neq l$, the relation $(r^{\dagger}\Delta r)_{kl} = 0$ is automatically fulfilled
and that the solutions to (102) only have to be properly
“normalized” by a constant coefficient to satisfy
$(r^{\dagger}\Delta r)_{kk} = 1$.
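One way of adapting a program for the ordinary secular problem (78) to the case with overlap, (102) and (103), is to transform to the orthonormalized set first. The sketch below assumes NumPy, real symmetric matrices, and a randomly generated stand-in for the matrix $(\mu|\Omega_{\mathrm{eff}}|\nu)$; the function name `solve_secular`, the sizes, and the overlap values are illustrative assumptions.

```python
import numpy as np


def solve_secular(F, delta):
    """Solve (F - omega * Delta) r = 0 by transforming to an orthonormal set.

    Returns the eigenvalues omega (ascending) and a matrix r whose columns
    satisfy r^T Delta r = 1.
    """
    w, v = np.linalg.eigh(delta)
    d = (v / np.sqrt(w)) @ v.conj().T        # Delta^(-1/2)
    omega, c = np.linalg.eigh(d @ F @ d)     # ordinary problem of the type (77), (78)
    return omega, d @ c                      # back-transformation r = Delta^(-1/2) c


M = 4
rng = np.random.default_rng(4)
a = rng.standard_normal((M, M))
F = (a + a.T) / 2                                          # hypothetical effective matrix
delta = np.eye(M) + 0.2 * (np.eye(M, k=1) + np.eye(M, k=-1))  # assumed overlaps

omega, r = solve_secular(F, delta)
print("(102) satisfied      :", np.allclose(F @ r, delta @ r @ np.diag(omega)))
print("r^T Delta r = 1      :", np.allclose(r.T @ delta @ r, np.eye(M)))
```

In this route the normalization $(r^{\dagger}\Delta r)_{kk} = 1$ comes out automatically, because the eigenvectors of the transformed problem are orthonormal in the ordinary sense.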





The MO-LCAO method in its “self-consistent
field” form was first discussed in detail by Roothaan,
but we note the essential simplification of the procedure
here obtained by introducing the charge- and
bond-order matrix $R$. The first numerical application
of Roothaan’s scheme was carried out by Parr and
Mulliken.²⁵

In analogy to (79)-(85), we may also derive the
Hartree-Fock equations by varying the matrix $R$ as
a whole. According to (94), we obtain

$$R\Delta \cdot \delta R \cdot \Delta R = 0, \tag{104}$$

which leads to the following auxiliary condition:

$$\mathrm{Tr}(\delta R \cdot \Delta R A R \Delta) = 0, \tag{105}$$

where $A$ is an arbitrary matrix of the order $M$. The
Hartree-Fock equations may then be condensed in a
single relation

$$(\mu|\Omega_{\mathrm{eff}}|\nu) = \bar{A}_{\mu\nu}, \tag{106}$$

where

$$\bar{A} = \Delta R A R \Delta. \tag{107}$$


(f) The Coulomb and Exchange Integrals in the
MO-LCAO Theory of Conjugated Systems


In the theory of conjugated organic compounds,
there are particularly two types of quantities which
are considered to be of importance, namely the
“Coulomb integrals” $\alpha_\mu$ and the “exchange integrals”
$\beta_{\mu\nu}$ ($\mu \neq \nu$), defined as the matrix elements of the effective
Hamiltonian but usually determined by fitting
some theoretical quantities containing them to experimental
data. In the Hartree-Fock scheme, the effective
Hamiltonian is now given by the simple expression
(86), and its matrix elements may then be determined
on a purely theoretical basis. For the elements of $\mathcal{H}_{\mathrm{eff}}$
with respect to the ordinary atomic spin-orbitals $\phi_\mu$,
we obtain

$$\begin{aligned}
(\mu|\mathcal{H}_{\mathrm{eff}}|\nu) ={}& \int \phi_\mu^*(1) \left\{ \frac{p_1^2}{2m} - e^2 \sum_{g} \frac{Z_g}{r_{1g}} \right\} \phi_\nu(1)\,dx_1 \\
&+ e^2 \sum_{\alpha\beta} R(\beta\alpha) \int
\frac{\phi_\mu^*(1)\,\phi_\alpha^*(2)\,\phi_\nu(1)\,\phi_\beta(2) - \phi_\mu^*(1)\,\phi_\alpha^*(2)\,\phi_\beta(1)\,\phi_\nu(2)}{r_{12}}\,dx_1 dx_2,
\end{aligned} \tag{108}$$

as a special case of (101). According to (76), the corresponding
elements $[\mu|\mathcal{H}_{\mathrm{eff}}|\nu]$ with respect to the
orthonormalized set $\varphi_\mu$ may be obtained by replacing
$\phi_\mu$ and $R$ in (108) by $\varphi_\mu$ and $Q$, respectively. We note
that the change of basic set is easily carried out by
means of the formulas

$$\begin{aligned}
r &= \Delta^{-1/2} c, \qquad R = \Delta^{-1/2}\,Q\,\Delta^{-1/2}, \\
[\mu|\mathcal{H}_{\mathrm{eff}}|\nu] &= \sum_{\alpha\beta} (\Delta^{-1/2})_{\mu\alpha}\,(\alpha|\mathcal{H}_{\mathrm{eff}}|\beta)\,(\Delta^{-1/2})_{\beta\nu}.
\end{aligned} \tag{109}$$

A further discussion of these quantities for the aromatic
compounds will be given in a following paper.





4. ORDINARY AND VIRTUAL HARTREE-FOCK 
FUNCTIONS 


In treating the basic eigenvalue problem (39):

$$\Omega_{\mathrm{eff}}(1)\,\psi_k(1) = \omega_k\,\psi_k(1), \tag{110}$$

by, e.g., the method of expanding the eigenfunctions
in a fixed set $\varphi_\mu$ of order $M$, we have found that, in
addition to the $N$ ordinary solutions used in forming
the density matrix (66), there are $(M-N)$ solutions
to (77) and (78) which are also eigenfunctions to $\Omega_{\mathrm{eff}}$
considered as a fixed operator. Hall and Lennard-Jones²⁶
have called these extra solutions virtual
Hartree-Fock functions, and they have pointed out
that they correspond to the existence of spin-orbitals
containing “virtual” particles moving under the influence
of the real particles of the system without influencing
their motion in return; the “virtual” particles
should therefore in some way behave like classical
“test charges.”

25 R. G. Parr and R. S. Mulliken, J. Chem. Phys. 18, 1338
(1950).

If $M \to \infty$ and the basic set $\varphi_\mu$ becomes complete,
the number of virtual solutions also becomes infinite,
but, without a detailed investigation, it is impossible
to predict the character of the eigenvalue spectrum
for $\omega_k$, how large a part of it will be discrete and how
large a part will be continuous, etc. It could happen
that the ordinary and virtual solutions to (110) 
together would form a complete orthonormal set, and, 
in such cases, this set is of particular importance for 
discussing the properties of the system. 

If the N ordinary solutions to (110) are used for 
describing the properties of the system in the Hartree- 
Fock scheme, the virtual spin-orbitals could be used 
for improving this approximation by, e.g., the method
of configurational interaction based on (I, 70) and
(I, 71). It is evident that, if the basic set satisfies
(110), considerable simplifications can be carried out
in the fundamental matrix elements (I, 68). Increasing
the number of virtual spin-orbitals taken into account,
one can in this way obtain a series of approximations,
where the Hartree-Fock scheme represents the first
step, as described by Møller and Plesset.²⁷

The more or less complete set of ordinary and virtual 
solutions to (110) may be used also in the one-particle 
space for solving eigenvalue problems of the same type 
as (110) but for other effective operators. One can 
then apply the standard scheme developed in Sec. 
3 (a), and the method is, of course, particularly con- 
venient if the operator under consideration is only 
slightly different from the basic operator $\Omega_{\mathrm{eff}}$, in
which case the procedure will be related to the “perturbation
method” developed by Peng.²⁸

In treating the virtual solutions to (110), one should
observe the difference between the exact potential (42)
containing also exchange operators and Slater’s
average potential (53). In deriving (53), we have
minimized the weighted “error sum” (I, 100) containing
only the ordinary solutions, and this means
that there might be considerable differences between
(42) and its approximate form (53) when applied to
the virtual spin-orbitals. For the sake of simplicity,
let us consider the Coulomb potential, in which case
(42) and (53) take the forms

$$V_{\mathrm{op}}(1) = e^2 \int \frac{\rho(2,2) - \rho(2,1)\,P_{12}}{r_{12}}\,dx_2, \tag{111}$$

$$V_{\mathrm{Av}}(1) = e^2 \int \frac{\rho(2,2) - \{\rho(2,1)\,\rho(1,2)/\rho(1,1)\}}{r_{12}}\,dx_2, \tag{112}$$

where the density matrix $\rho$ has the character of a projection
operator fulfilling (13). We have further

$$\int \rho(2,1)\,\psi_k(2)\,dx_2 =
\begin{cases}
\psi_k(1), & k = 1, 2, \ldots N; \\
0, & k = N+1, N+2, \ldots,
\end{cases} \tag{113}$$

giving a characteristic difference between the ordinary
spin-orbitals ($k = 1, 2, \ldots N$) and the virtual spin-orbitals
($k = N+1, N+2, \ldots$). The potentials (111)
and (112) are different particularly with respect to
their asymptotic behavior when applied to virtual
spin-orbitals, and, if $x_1$ is a point having a very large
distance $R_1$ from the average position of the ordinary
particles in the system, then we obtain

$$V_{\mathrm{op}}(1)\,\psi_k(1) \sim \frac{e^2}{R_1}\,\psi_k(1) \times
\begin{cases}
N-1, & k = 1, 2, \ldots N; \\
N, & k = N+1, N+2, \ldots,
\end{cases} \tag{114}$$

and

$$V_{\mathrm{Av}}(1)\,\psi_k(1) \sim \frac{e^2}{R_1}\,\psi_k(1) \times (N-1), \quad \text{for all } k. \tag{115}$$

This result implies that the virtual spin-orbitals will
be essentially different for the two potentials under
consideration. We observe that the approximate form
(112) corresponds to a screening of the nuclear framework
which is one particle less than the screening corresponding
to (111), and one can therefore expect that
the virtual solutions associated with (112) would be
more stable and have a more extended discrete eigenvalue
spectrum than the virtual spin-orbitals belonging
to the exact eigenvalue problem (110). From
this point of view, the Slater potential would therefore
be more convenient in constructing a complete set.²⁹

26 G. G. Hall and J. Lennard-Jones, Proc. Roy. Soc. (London)
A202, 155 (1950).

27 C. Møller and M. S. Plesset, Phys. Rev. 46, 618 (1934).

28 H. W. Peng, Proc. Roy. Soc. (London) A178, 499 (1941).

In the MO-LCAO theory of molecules and crystals, 
sets of approximate virtual spin-orbitals have been 
evaluated in a few cases and used in configurational 
interaction, but, otherwise, the theory of the virtual 
Hartree-Fock solutions to (110) seems to be a field 
waiting for a closer investigation. 


5. TREATMENT OF IONIZED AND EXCITED STATES 


In considering the eigenvalue problem (32), we have
applied the variation principle (I, 22) without any
further restraining condition, and this means that we
are actually investigating the state corresponding to
the lowest eigenvalue of $\Omega_{\mathrm{op}}$, i.e., the ground state of
this operator. In this section, we will now show how the
method may be extended to include also the treatment
of ionized and excited states.

29 See also Part I, reference 17.

In Sec. 1, we have shown that, in the Hartree-Fock
approximation, the entire physical situation of the
system is determined by the first-order density matrix
$\gamma(x_1'|x_1) = \rho(x_1',x_1)$, and we have not been able to give
a particular physical importance to any one of the basic
sets of individual spin-orbitals $\psi_k$ ($k = 1, 2, \ldots N$) being
connected with each other by unitary transformations.
However, in considering also the ionized states, Koopmans³⁰
has shown that a special physical meaning could
be attached to the set representing the eigenfunctions
to the effective operator (40), i.e., to the set for which
the matrix $\lambda(l|k)$ of the Lagrangian multipliers is
diagonalized, and, in this connection, he gave also a
specific interpretation of the corresponding eigenvalues.

In considering also other properties of the system, it
could happen that another set of spin-orbitals will get a
particular importance, and, as an example, we will
mention the “equivalent” orbitals introduced by
Lennard-Jones³¹ in treating the problem of chemical
valency.





Here we will show that it is possible to extend Koopmans’
theorem for ionized states to be valid even for
many-particle operators $\Omega_{\mathrm{op}}$ of the form (I, 2). However,
the main purpose of the investigation is to try to give
a thorough treatment also of the problem of the excited
states.

Let us consider two different states of the same
system having a fixed outer framework³² and representing
different eigenstates of the same operator (I, 2).
In the Hartree-Fock approximation, these states may
be described by two single-determinant wave functions
$\Psi'$ and $\Psi$, characterized by the invariants $\rho'$ and $\rho$,
respectively, where

$$\rho' = \rho + \Delta\rho. \tag{116}$$

In order to determine the differences

$$\Delta\langle\Omega_{\mathrm{op}}\rangle = \langle\Omega_{\mathrm{op}}'\rangle_{\mathrm{Av}} - \langle\Omega_{\mathrm{op}}\rangle_{\mathrm{Av}}, \qquad
\Delta\Omega_{\mathrm{eff}} = \Omega_{\mathrm{eff}}' - \Omega_{\mathrm{eff}}, \tag{117}$$

we will put (116) into (21) and (40)-(42), and carry
out the subtractions. We obtain the two basic formulas

$$\begin{aligned}
\Delta\langle\Omega_{\mathrm{op}}\rangle ={}& \int \Omega_{\mathrm{eff}}(1)\,\Delta\rho(1',1)\,dx_1
+ (2!)^{-1} \int \Omega_{12}
\begin{vmatrix} \Delta\rho(1',1) & \Delta\rho(1',2) \\ \Delta\rho(2',1) & \Delta\rho(2',2) \end{vmatrix} dx_1 dx_2 \\
&+ (3!)^{-1} \int \Omega_{123} \left\{ 3
\begin{vmatrix} \Delta\rho(1',1) & \Delta\rho(1',2) & \rho(1',3) \\ \Delta\rho(2',1) & \Delta\rho(2',2) & \rho(2',3) \\ \Delta\rho(3',1) & \Delta\rho(3',2) & \rho(3',3) \end{vmatrix}
+ \begin{vmatrix} \Delta\rho(1',1) & \Delta\rho(1',2) & \Delta\rho(1',3) \\ \Delta\rho(2',1) & \Delta\rho(2',2) & \Delta\rho(2',3) \\ \Delta\rho(3',1) & \Delta\rho(3',2) & \Delta\rho(3',3) \end{vmatrix}
\right\} dx_1 dx_2 dx_3 + \cdots,
\end{aligned} \tag{118}$$

and

$$\begin{aligned}
\Delta\Omega_{\mathrm{eff}}(1) ={}& \int \Omega_{12}\,(1 - P_{12})\,\Delta\rho(2',2)\,dx_2 \\
&+ (2!)^{-1} \int \Omega_{123}\,(1 - P_{12} - P_{13}) \left\{ 2
\begin{vmatrix} \Delta\rho(2',2) & \rho(2',3) \\ \Delta\rho(3',2) & \rho(3',3) \end{vmatrix}
+ \begin{vmatrix} \Delta\rho(2',2) & \Delta\rho(2',3) \\ \Delta\rho(3',2) & \Delta\rho(3',3) \end{vmatrix}
\right\} dx_2 dx_3 + \cdots.
\end{aligned} \tag{119}$$


30 T. Koopmans, Physica 1, 104 (1933).

31 J. Lennard-Jones, Proc. Roy. Soc. (London) A198, 1, 14
(1949); G. G. Hall and J. Lennard-Jones, Proc. Roy. Soc. (London)
202, 155 (1950); J. Lennard-Jones and J. A. Pople, Proc. Roy.
Soc. (London) 202, 166 (1950); J. A. Pople, Proc. Roy. Soc.
(London) 202, 323 (1950); G. G. Hall, Proc. Roy. Soc. (London)
202, 336 (1950); G. G. Hall and J. Lennard-Jones, Proc. Roy. Soc.
(London) 205, 357 (1951); G. G. Hall, Proc. Roy. Soc. (London)
205, 541 (1951); 213, 113 (1952).


(a) Ionized States


Let us first consider the singly ionized states, where
the system has the operator (I, 2) but contains only
$(N-1)$ particles. This means that one particle has been
removed to infinity from the original system, and it
seems therefore natural to assume that, at least in a
first approximation, nothing new has been added to the
system and that, in the terminology of Part I, Sec. 5,
the matrix $\rho'$ for the ionized state belongs entirely to the
subspace defined by the matrix $\rho$ for the original state, or
$\rho\rho'\rho = \rho'$. The matrices $\rho$ and $\rho'$ are both “projection
operators” fulfilling the relations $\rho^2 = \rho$, $\mathrm{Tr}(\rho) = N$,
$(\rho')^2 = \rho'$, $\mathrm{Tr}(\rho') = N-1$, and for their difference
$\Delta\rho = \rho' - \rho$, we then obtain

$$(\Delta\rho)^2 = -\Delta\rho, \qquad \mathrm{Tr}(\Delta\rho) = -1. \tag{120}$$

In expanding $\Delta\rho$ in a fixed basic set according to (I, 34),
this implies that the matrix of the coefficients has a
single eigenvalue equal to $-1$ and all the others 0. By
transforming this expansion to “natural spin-orbitals”
analogous to (I, 74), we then obtain

$$\Delta\rho(x_1,x_2) = -\psi^*(x_1)\,\psi(x_2), \tag{121}$$

where $\psi(x)$ is an undetermined spin-orbital. Our basic
assumption leads therefore automatically to a factorization
of the difference $\Delta\rho$, which will essentially
simplify the discussion, since all determinants in (118)
which are of at least the second degree in $\Delta\rho$ will now
vanish identically.
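The properties of the difference matrix stated in (120) and (121) can be illustrated in a finite basis. The sketch below assumes NumPy, real coefficients, and the simplifying assumption of the text that the ionized state is obtained by striking out one of the occupied spin-orbitals; the sizes and the index `k` of the removed orbital are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(5)
M, N, k = 6, 3, 1          # hypothetical basis size, particle number, removed orbital

c, _ = np.linalg.qr(rng.standard_normal((M, N)))   # occupied orbitals, c^T c = 1
Q = c @ c.T                                        # original state, Tr(Q) = N

c_ion = np.delete(c, k, axis=1)                    # strike out the column of orbital k
Q_ion = c_ion @ c_ion.T                            # ionized state, Tr(Q_ion) = N - 1

dQ = Q_ion - Q
eigvals = np.linalg.eigvalsh(dQ)                   # ascending order

print("Q Q' Q = Q'         :", np.allclose(Q @ Q_ion @ Q, Q_ion))
print("(120) dQ^2 = -dQ    :", np.allclose(dQ @ dQ, -dQ))
print("(120) Tr(dQ)        :", np.trace(dQ))
print("(121) factorization :", np.allclose(dQ, -np.outer(c[:, k], c[:, k])))
print("eigenvalues of dQ   :", np.round(eigvals, 12))
```

The single eigenvalue $-1$ belongs to the removed orbital, so that the difference matrix is minus a projector of rank one, which is the finite-basis counterpart of the factorization (121).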


Jn considering molecules and crystals, this means that we 
treat only “vertical” transitions with the nuclei fixed in the same 
positions in both states. If the nuclei are in equilibrium in one 
state, they are therefore usually outside their equilibrium positions 
in the other states; compare the Franck-Condon principle. 
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Combining (118) and (121), we obtain 


(122) 


Aap) == f ¥*A) Quer, 


(op!) nw=(Dop)u— f Y*(1)Qeee(1Y(1)der. (123) 


According to the variation principle (I, 22), the eigenstates
of the ionized system are associated with the
extreme values of $\langle\Omega_{op}'\rangle_{Av}$, and, since
$\langle\Omega_{op}\rangle_{Av}$ is a constant, these occur simultaneously
with the extreme values of the last term in the right-hand member of
(123), i.e., for spin-orbitals satisfying the relation

$\Omega_{eff}(1)\,\psi(x_1) = \omega\,\psi(x_1)$, (124)

where the variation principle is applied a second time to
the one-particle space. The eigenfunctions $\psi_k$ to $\Omega_{eff}$ are
therefore of particular importance in constructing the
ionized states, and, if the wave function for the original
state is a determinant built up from the functions $\psi_1$,
$\psi_2$, $\cdots$ $\psi_N$, then the wave function for the ionized state
(k) may be obtained by striking away the column
containing the spin-orbital $\psi_k$, together with an arbitrary
particle row. Combination of (123) and (124) gives
further

$\langle\Omega_{op}'\rangle_{Av} = \langle\Omega_{op}\rangle_{Av} - \omega_k$. (125)

This is a generalization of Koopmans’ theorem to
include also many-particle operators, and it gives a
rather visual interpretation of the eigenfunctions $\psi_k$
($k = 1, 2, \cdots N$) and the corresponding eigenvalues $\omega_k$.
We note that the result is not exact even within the
Hartree-Fock approximation, since it is based on a
simplifying assumption leading to (121). A proper treatment
could be carried out by solving the Hartree-Fock
equations for the system of $(N-1)$ particles by applying
the methods in (3a) to, e.g., the basic set formed by
the ordinary and virtual eigenfunctions of $\Omega_{eff}$ for the
system of N particles. However, as Mulliken⁴ has
pointed out, due to cancellation of errors, it seems
likely that the values of $\Delta\langle\Omega\rangle$ given in (125) will show
better agreement with experimental data than the
refined quantities.


(b) Excited States


Let us now turn to the problem of the excited states
which, according to our opinion, has not been too
satisfactorily treated in the literature. If $\rho$ and $\rho'$ are
the invariants associated with the original and the
excited state, respectively, we have

$\rho' = \rho + \Delta\rho = \rho + \Delta\rho_j - \Delta\rho_i$, (126)


It should be noted that, in several textbooks and surveys,
Koopmans’ theorem has been treated rather superficially,
proving only the relation $\Delta\langle\Omega_{op}\rangle = -\omega_k$ without considering the
extreme value properties.

See also G. G. Hall and J. Lennard-Jones, Proc. Roy. Soc.
(London) A202, 155 (1950), where this problem is treated by using
perturbation theory.

where the quantity $+\Delta\rho_j$ indicates that something
essentially new has been added in the excited state,
whereas the quantity $-\Delta\rho_i$ means that something
previously existing in the original state has been taken
away. In a first approximation, we will assume that the
three matrices

$\rho - \Delta\rho_i$, $\rho$, $\rho + \Delta\rho_j$, (127)

are all idempotent ($\rho^2 = \rho$, etc.), and that the first
belongs to the subspace defined by the second, and the
second to the subspace defined by the third; see (I, 88).
It is then easily shown that the matrices $\Delta\rho_i$ and $\Delta\rho_j$
are also idempotent with 0 and 1 as their only eigenvalues,
and their traces are then also integers. For a
single excitation, their traces are 1, and the coefficient
matrices in their expansions (I, 34) have therefore a
single eigenvalue equal to 1 and the others equal to
zero. By transforming $\Delta\rho_i$ and $\Delta\rho_j$ to diagonal form
according to (I, 74), we then obtain

$\Delta\rho_i(1,2) = \psi_i^*(1)\psi_i(2)$, $\Delta\rho_j(1,2) = \psi_j'^*(1)\psi_j'(2)$, (128)

where $\psi_i$ and $\psi_j'$ are natural spin-orbitals to be determined.
Under these simplifying assumptions, $\Delta\rho$ may
therefore be written in the form

$\Delta\rho(1,2) = \psi_j'^*(1)\psi_j'(2) - \psi_i^*(1)\psi_i(2)$ (129)

for a single excitation, and we will now see that this
special form for $\Delta\rho$ leads to a considerable simplification
of our discussion.

The basic quantities (118) and (119) are expressed
in terms of determinants of the elements $\rho$ and $\Delta\rho$, and
we note that all determinants of third or higher degree
in $\Delta\rho$ will vanish identically independent of their orders,
since they may be expanded in determinants containing
two or more columns of $\Delta\rho_i$ or $\Delta\rho_j$, which vanish due
to the factorization in (128). By putting $\rho = \rho' - \Delta\rho$ into
(21) and repeating the subtraction, we may also obtain
a new form for $\Delta\langle\Omega_{op}\rangle$ expressed in $\Omega_{eff}'$, $\rho'$, and $\Delta\rho$,
which is analogous to (118) but has minus signs for all
the determinants of the second degree in $\Delta\rho$. By adding
the two expressions for $\Delta\langle\Omega\rangle$, the determinants of the
second degree cancel, and the higher order determinants
combine to determinants of the third degree in $\Delta\rho$,
which will then vanish identically due to the argument
given above. In this way, we obtain the simple formula:†

$\Delta\langle\Omega_{op}\rangle = \tfrac{1}{2}\int \{\Omega_{eff}'(1) + \Omega_{eff}(1)\}\,\Delta\rho(1',1)\,dx_1$. (130)


† We note that, in Eqs. (130)-(132) and (137)-(138), the
“prime” has been used with two different meanings, which must
not be confused: the prime on $\Omega_{eff}'(1)$ indicates that the operator
is associated with the excited state, whereas the prime on $x_1'$ in
the integrands indicates that the operators do not work on this
coordinate, which afterwards has to be put equal to $x_1$.


By using (119) and (128), we may further derive the







QUANTUM THEORY OF MANY-PARTICLE SYSTEMS. II 


relation:

$\int \{\Omega_{eff}'(1) - \Omega_{eff}(1)\}\{\Delta\rho_j(1',1) + \Delta\rho_i(1',1)\}\,dx_1 = 0$, (131)

which, in combination with (130), gives

$\Delta\langle\Omega_{op}\rangle = \int \Omega_{eff}'(1)\,\Delta\rho_j(1',1)\,dx_1 - \int \Omega_{eff}(1)\,\Delta\rho_i(1',1)\,dx_1$, (132)

or finally

$\langle\Omega_{op}'\rangle_{Av} = \langle\Omega_{op}\rangle_{Av} + \int \psi_j'^*(1)\,\Omega_{eff}'(1)\,\psi_j'(1)\,dx_1 - \int \psi_i^*(1)\,\Omega_{eff}(1)\,\psi_i(1)\,dx_1$. (133)

According to the variation principle (I, 22), the
higher eigenvalues of $\Omega_{op}$ are associated with the extreme
values of $\langle\Omega_{op}'\rangle_{Av}$, and since $\langle\Omega_{op}\rangle_{Av}$ is a constant, these
occur simultaneously with the extreme values of the
second and third terms of the right-hand member of
(133), i.e., for functions $\psi_j'$ and $\psi_i$ satisfying

$\Omega_{eff}'\psi_j' = \omega_j'\psi_j'$, $\Omega_{eff}\psi_i = \omega_i\psi_i$. (134)

For the excitation difference, we then obtain

$\langle\Omega_{op}'\rangle_{Av} - \langle\Omega_{op}\rangle_{Av} = \omega_j' - \omega_i$. (135)

The process may therefore be described as an excitation
$i \to j$ of an entire particle from an occupied spin-orbital
$\psi_i$, being an eigenfunction to $\Omega_{eff}$, to an unoccupied
spin-orbital $\psi_j'$, being an eigenfunction to $\Omega_{eff}'$.
We note particularly that, in (135), the quantity $\omega_j'$ is
a spin-orbital eigenvalue associated with the operator
$\Omega_{eff}'$ for the excited state, and that $\Omega_{eff}'$ may have an
eigenvalue spectrum which is rather different from the
corresponding spectrum for $\Omega_{eff}$. It is evident that the
arguments leading to the naive form $(\omega_j - \omega_i)$ for the
excitation difference $\Delta\langle\Omega_{op}\rangle$ must be erroneous, but also
that it is possible to preserve the visuality of the theory
by introducing the operator $\Omega_{eff}'$.

Let us now construct the wave function $\Psi'$ for the
excited state by using (129) and (134). If the function
$\Psi$ for the ground state is a determinant built up from
the eigenfunctions $\psi_k$ ($k = 1, 2, \cdots N$) to the operator
$\Omega_{eff}$, then $\Psi'$ is the determinant obtained from $\Psi$ by
replacing the column containing the spin-orbital $\psi_i$ by
a column containing the excited spin-orbital $\psi_j'$. Using
(119) and (128), we may derive the relation

$\int \psi_j'^*(1)\,\psi_i(1)\,dx_1 = 0$, (136)

which shows that the spin-orbitals $\psi_j'$ and $\psi_i$ satisfying
(134) will still be orthogonal, and, according to (I, 39),
the same is then true also for the total wave functions
$\Psi'$ and $\Psi$. The auxiliary condition to the variational
principle (I, 22) is therefore fulfilled.

Even under the simplifying assumptions leading to
(128), the treatment of the excited states is a rather
complicated problem due to the unknown character of
the operator $\Omega_{eff}'$. According to (119), $\Omega_{eff}'$ is given by
the formula

$\Omega_{eff}'(1) = \Omega_{eff}(1) + \int \Omega_{12}(1 - P_{12})\,\Delta\rho(2',2)\,dx_2 + \frac{1}{2!}\int \Omega_{123}(1 - P_{12} - P_{13}) \left\{ 2\begin{vmatrix} \Delta\rho(2',2) & \rho(2',3) \\ \Delta\rho(3',2) & \rho(3',3) \end{vmatrix} + \begin{vmatrix} \Delta\rho(2',2) & \Delta\rho(2',3) \\ \Delta\rho(3',2) & \Delta\rho(3',3) \end{vmatrix} \right\} dx_2\,dx_3 + \cdots$, (137)

but, as usual in the Hartree-Fock scheme, the problem of finding its eigenfunctions has a nonlinear character,
since $\Omega_{eff}'$ depends on $\Delta\rho$ and consequently also on $\psi_j'$.
In principle, this problem could be solved by the general method developed in (3a) by expanding the
eigenfunctions $\psi_j'$ to $\Omega_{eff}'$ in the set formed by the ordinary and virtual eigenfunctions to $\Omega_{eff}$. In a first very rough
approximation, the eigenvalues to $\Omega_{eff}'$ are then given by first-order perturbation theory:


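$\omega_j' = \int \psi_j^*(1)\,\Omega_{eff}'(1)\,\psi_j(1)\,dx_1 = \omega_j - \int \Omega_{12} \begin{vmatrix} \Delta\rho_j(1',1) & \Delta\rho_j(1',2) \\ \Delta\rho_i(2',1) & \Delta\rho_i(2',2) \end{vmatrix} dx_1\,dx_2 - \int \Omega_{123} \begin{vmatrix} \Delta\rho_j(1',1) & \Delta\rho_j(1',2) & \Delta\rho_j(1',3) \\ \Delta\rho_i(2',1) & \Delta\rho_i(2',2) & \Delta\rho_i(2',3) \\ \rho(3',1) & \rho(3',2) & \rho(3',3) \end{vmatrix} dx_1\,dx_2\,dx_3 - \cdots$, (138)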


See, e.g., J. C. Slater, “Technical Report No. 6 of the Solid-State and Molecular Theory Group at M.I.T.,” April 15, 1954 (unpublished).


*% Compare C. C. J. Roothaan, Revs. Modern Phys. 23, 69 (1951), p. 80. 
where we have used (128). Using the notations (49),
this may also be written 


; . : 
0; ~aj— Vii-de Vj#— ees, 


(139) 


which result corresponds to the formula for the two- 
particle case and average singlet-triplet state given by 
Roothaan.** However, first-order perturbation theory is 
probably not accurate enough for treating the question 
of the nature of the excited states, and a much more 
detailed study seems therefore to be necessary.* In a 
really accurate theory, one must finally remove the 
simplifying assumptions leading to the factorization in 
(128). 


(c) Limitations of the Present Theory 


Up till now we have assumed that the total wave
function for the state and system under consideration
may be represented by a single Slater determinant. We
note that, in most cases, this means a rather hard
restriction on the validity of the Hartree-Fock scheme;
only in exceptional cases can we consider, e.g., pure
spin states, since a single determinant will in general
represent a mixture between several multiplets.

In order to treat this problem in greater detail, we
will assume that the natural spin-orbitals involved have
either plus or minus spin and write the fundamental
density matrix $\rho$, defined by (35), in the form

$\rho(x_1,x_2) = \rho_+(r_1,r_2)\,\alpha(s_1)\alpha(s_2) + \rho_-(r_1,r_2)\,\beta(s_1)\beta(s_2)$, (140)

where we have separated the two groups of orbitals
having different spins. The matrices $\rho_+$ and $\rho_-$ are
“projection operators” in the ordinary r-space satisfying
the relations

$\rho_+^2 = \rho_+$, $\mathrm{Tr}(\rho_+) = N_+$,
$\rho_-^2 = \rho_-$, $\mathrm{Tr}(\rho_-) = N_-$, (141)

where $N_+$ and $N_-$ are the number of orbitals associated
with plus and minus spin, respectively. Independent of
the way in which we have chosen our two sets of orbitals,
we will further let $\nu$ denote the number of “doubly
occupied orbitals,” defined by the integral

$\nu = \int \rho_+(r_1,r_2)\,\rho_-(r_2,r_1)\,dv_1\,dv_2$. (142)

The average value of the total spin $S^2$ (measured in
units of $\hbar$) with respect to a single determinant, characterized
by the invariant (140), is given by the general
formula (I, 14), and, since the second-order density is
represented by the determinant (19), we obtain after
some elementary calculations:

$\langle S^2\rangle_{Av} = \tfrac{1}{4}(N_+ - N_-)^2 + (\tfrac{1}{2}N - \nu)$. (143)

Only in the special cases $N_+ = N_- = \nu = N/2$ and $N_+ = N$,
$N_- = \nu = 0$, we have, respectively,

$\langle S^2\rangle_{Av} = 0$, $\langle S^2\rangle_{Av} = \tfrac{1}{2}N(\tfrac{1}{2}N + 1)$, (144)

corresponding to the pure spin states of lowest and
highest multiplicity.

It is therefore evident that the ordinary Hartree-Fock
scheme cannot properly treat states and systems
showing spin or orbital degeneracies, since, in such
cases, the wave function cannot be represented by a
single determinant. It is also well known that correlation
effects associated with particles having different
spins are not taken into account in constructing the
single-determinant wave function (2). These weaknesses
in the present theory may be removed only by considering
wave functions to be sums of Slater determinants,
i.e., by using the method of “configurational
interaction” described in Part I. However, between the
ordinary Hartree-Fock scheme and the exact method
of configurational interaction, there seems to exist also
an intermediate stage of “fixed” configurational interaction,
where it is possible to preserve part of the
physical simplicity and visuality characteristic of the
Hartree-Fock method. This problem will be treated in
a following paper of this series.
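Formula (143) is easy to evaluate numerically. The following is a minimal sketch, not part of the original paper: the function names and the example occupation numbers are illustrative assumptions. It reproduces the two special cases (144) and shows that a determinant with different orbitals for different spins ($\nu = 0$, $N_+ = N_-$) gives a value of $\langle S^2\rangle_{Av}$ that is not of the form $l(l+1)$, i.e., a mixture of multiplets.

```python
from fractions import Fraction


def s2_single_determinant(n_plus, n_minus, nu):
    """Average of S^2 (units of hbar^2) for one determinant, formula (143)."""
    n_total = n_plus + n_minus
    return Fraction(n_plus - n_minus, 2) ** 2 + Fraction(n_total, 2) - Fraction(nu)


def s2_pure(l):
    """Eigenvalue l(l+1) of S^2 for a pure spin state."""
    return Fraction(l) * (Fraction(l) + 1)


# Hypothetical six-particle examples (numbers chosen only for illustration).
closed_shell = s2_single_determinant(3, 3, 3)   # N+ = N- = nu = N/2
high_spin = s2_single_determinant(6, 0, 0)      # N+ = N, N- = nu = 0
split_orbitals = s2_single_determinant(3, 3, 0) # no doubly occupied orbitals

# Pure-state values l(l+1) available for N = 6 (l = 0, 1, 2, 3).
pure_values = {s2_pure(l) for l in range(4)}
```

Here `closed_shell` and `high_spin` coincide with the pure values for $l = 0$ and $l = N/2$, whereas `split_orbitals` falls between the allowed values.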
In treating a system of N antisymmetric particles, it is shown that, if the total Hamiltonian $\mathcal{H}_{op}$ is
degenerate, the eigenstates of the operator used for classifying the corresponding degenerate states may
be selected by means of a “projection operator” $\mathcal{O}$. If the total wave function is approximated by such a
projection of a single determinant, the description of the system may be reduced to the ordinary Hartree-
Fock scheme treating this determinant, if the original Hamiltonian is replaced by a complete Hamiltonian
$\Omega_{op} = \mathcal{O}^\dagger\mathcal{H}_{op}\mathcal{O}$ containing also many-particle interactions. This approach corresponds to a “fixed” con-
Fock approximation. The idea of “doubly filled” orbitals is abandoned, and the orbitals associated with
different spins will automatically try to arrange themselves in such a way that particles having antiparallel 
spins will tend to avoid each other due to their mutual repulsion. 





THE Hartree-Fock scheme for treating a system of
N antisymmetric particles has a physical sim- 
plicity and visuality due to the fact that a single deter- 
minant is the simplest wave function based on the 
“independent-particle” model with the correct sym- 
metry. However, it is well known that this scheme 
cannot properly treat states and systems having spin 
or orbital degeneracies and that further “correlation 
effects” associated with particles having different spins 
are not taken into account in constructing the single- 
determinant wave function. The purpose of this paper 
is to show that there exists a form of “fixed” configurational
interaction, based on the use of “projection operators,”
which may be considered as an extension of the
ordinary Hartree-Fock scheme to include degenerate 
systems and correlation effects, since it preserves the 
physical simplicity and visuality of the original scheme. 
This extended scheme is an intermediate stage between 
the ordinary Hartree-Fock approximation and the exact 
method of complete configurational interaction, treated 
in Parts II and I, respectively.¹


1. TREATMENT OF SPIN AND ORBITAL 
DEGENERACIES BY PROJECTION 
OPERATORS 

Let us consider a system of N antisymmetric particles
having a Hamiltonian operator in configurational space
of the form

$\mathcal{H}_{op} = \mathcal{H}_{(0)} + \sum_i \mathcal{H}_i + \frac{1}{2!}\sum_{ij}{}' \mathcal{H}_{ij} + \frac{1}{3!}\sum_{ijk}{}' \mathcal{H}_{ijk} + \cdots$, (1)

analogous to (I, 2). If its eigenstates are degenerate, we
will assume that they may be classified by an operator $\Lambda$
having a finite number of discrete eigenvalues $\lambda_1$,
$\lambda_2$, $\cdots$ $\lambda_n$.

* This work was supported in part by the U. S. Office of Naval
Research under its contract with Massachusetts Institute of
Technology.

¹ P. O. Löwdin, preceding paper [Phys. Rev. 97, 1474 and 1490
(1955)]. These papers are in the following referred to as Parts I
and II, respectively.

We let further $\Psi$ be an arbitrary wave function
associated with the space of degeneracy, having an
expansion of the form

$\Psi = \sum_{k=1}^{n} \Psi_k$, (2)

where $\Psi_k$ is an eigenfunction to $\Lambda$ belonging to the
eigenvalue $\lambda_k$. We note that, since the factor $(\Lambda - \lambda_k)$
annihilates the term $\Psi_k$ in this expansion, the
operator

$\mathcal{O}_l = \prod_{k \neq l} (\Lambda - \lambda_k)/(\lambda_l - \lambda_k)$ (3)

takes out only the term for $k = l$, i.e., it gives the
“orthogonal projection” of $\Psi$ on the eigenstate of $\Lambda$
having the eigenvalue $\lambda_l$:

$\mathcal{O}_l\Psi = \Psi_l$, (4)

where we have used the terminology of Part I, Sec. 5.
The operator $\mathcal{O}_l$ is therefore also a projection operator,
which, in matrix representation, fulfills the Cayley-
Hamilton equation $(\Lambda - \lambda_l)\mathcal{O}_l = 0$. Since the factors in
(3) may be written in the form

$(\Lambda - \lambda_k)/(\lambda_l - \lambda_k) = 1 + (\Lambda - \lambda_l)/(\lambda_l - \lambda_k)$, (5)

we can then easily derive the relation

$\mathcal{O}_l^2 = \mathcal{O}_l$, (6)

characteristic for the projection operators.
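The product formula (3) can be tried out on a small matrix. The sketch below is an illustration under stated assumptions, not a construction taken from the paper: it takes for $\Lambda$ the matrix of $S^2$ for two particles of spin one half in the product basis $(\alpha\alpha, \alpha\beta, \beta\alpha, \beta\beta)$, which has the two distinct eigenvalues 0 and 2, and the helper names `mat_mul`, `mat_vec`, and `projector` are hypothetical. Exact rational arithmetic is used so that relations (4) and (6) can be checked without rounding.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def mat_vec(a, v):
    """Matrix applied to a column vector."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in a]


def projector(lam, eigenvalues, l_index):
    """O_l = prod_{k != l} (Lambda - lambda_k) / (lambda_l - lambda_k), Eq. (3)."""
    size = len(lam)
    result = [[Fraction(int(i == j)) for j in range(size)] for i in range(size)]
    lam_l = Fraction(eigenvalues[l_index])
    for k, lam_k in enumerate(eigenvalues):
        if k == l_index:
            continue
        factor = [[(Fraction(lam[i][j]) - (lam_k if i == j else 0)) / (lam_l - lam_k)
                   for j in range(size)] for i in range(size)]
        result = mat_mul(result, factor)
    return result


# S^2 = 1 + P_12 for two spin-1/2 particles, basis (aa, ab, ba, bb).
S2 = [[2, 0, 0, 0],
      [0, 1, 1, 0],
      [0, 1, 1, 0],
      [0, 0, 0, 2]]
EIGENVALUES = [0, 2]  # l(l+1) for l = 0 (singlet) and l = 1 (triplet)

singlet = projector(S2, EIGENVALUES, 0)
triplet = projector(S2, EIGENVALUES, 1)

alpha_beta = [0, 1, 0, 0]                    # the product state alpha(1)beta(2)
singlet_part = mat_vec(singlet, alpha_beta)  # its projection on the singlet
```

The projection of $\alpha\beta$ comes out as $\tfrac{1}{2}(\alpha\beta - \beta\alpha)$, and the two projectors add up to the unit matrix, as expected from (2) and (4).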


Spin Degeneracies 


As an example, we will now consider the projection
operators associated with the spin degeneracy of N
antisymmetric particles having spin one half, as electrons
or nucleons. Measuring the spin in units of
$\hbar$, we know that $S^2$ has the eigenvalues $l(l+1)$, where
$l = N/2$, $(N/2) - 1$, $(N/2) - 2$, $\cdots$ 0 or $\tfrac{1}{2}$ depending on
whether N is even or odd. According to (3), the operator
for selecting a state of multiplicity $(2l+1)$ is then

${}^{2l+1}\mathcal{O} = \prod_{k \neq l} \{S^2 - k(k+1)\}/\{l(l+1) - k(k+1)\}$, (7)

where the product is to be taken over all $k \neq l$ from 0
or $\tfrac{1}{2}$ to $N/2$. For $S^2$ we may here use one of the
expressions

$S^2 = \sum_{ij} \mathbf{S}_i \cdot \mathbf{S}_j$
$= S_+S_- + S_z^2 - S_z$, (8)
$= -\tfrac{1}{4}N(N-4) + \tfrac{1}{2}\sum_{i<j}(1 + \boldsymbol{\sigma}_i \cdot \boldsymbol{\sigma}_j)$
$= -\tfrac{1}{4}N(N-4) + \sum_{i<j} P_{ij}^{\sigma}$. (9)

² J. v. Neumann, Math. Grundlagen der Quantenmechanik (Dover
Publications, New York, 1943), p. 41.


We will now investigate the effect of the spin projection
operator (7) on a single determinant built up
from N spin-orbitals. For the sake of simplicity, let us
assume that N is even ($N = 2n$) and that we are interested
in states with $S_z = 0$, i.e., having an equal number
of $\alpha$ and $\beta$ spins. Let us further assume that we have N
orbitals $a_1$, $a_2$, $\cdots$ $a_n$, $b_1$, $b_2$, $\cdots$ $b_n$ at our disposal, and
that the first n orbitals are occupied by particles with
plus spin and the last n orbitals are occupied by particles
with minus spin. The corresponding Slater determinant

$(N!)^{-1/2}\det\{a_1\alpha, a_2\alpha, \cdots a_n\alpha \,|\, b_1\beta, b_2\beta, \cdots b_n\beta\}$ (10)

may then be denoted by the abbreviated symbol

$\{\alpha\alpha\cdots\alpha \,|\, \beta\beta\cdots\beta\}$ (11)

showing the spin distribution over the N orbitals taken
in their given order.

Pratt³ has recently described a spin-operator formalism
for constructing singlets, but, since he has not
used the projection operator idea, his treatment is
considerably more complicated than here. Using his
notations, we will now introduce the quantities

$T_0 = \{\alpha\alpha\cdots\alpha \,|\, \beta\beta\cdots\beta\}$,
$T_1 = \{(\beta\alpha\alpha\cdots) + (\alpha\beta\alpha\cdots) + \cdots \,|\, (\alpha\beta\beta\cdots) + (\beta\alpha\beta\cdots) + \cdots\}$,
$T_2 = \{(\beta\beta\alpha\cdots) + (\beta\alpha\beta\cdots) + \cdots \,|\, (\alpha\alpha\beta\cdots) + (\alpha\beta\alpha\cdots) + \cdots\}$, (12)
$\cdots$
$T_n = \{\beta\beta\beta\cdots\beta \,|\, \alpha\alpha\cdots\alpha\}$,

³ G. W. Pratt, Jr., Phys. Rev. 92, 278 (1953).
where $T_k$ is the sum of all different determinants, obtained
by k interchanges of the spin functions between
the two originally given groups ($\alpha$) and ($\beta$) of orbitals;
$T_0$ is identical with the given determinant (10). In (12),
we have used Pratt’s symbolic way of “multiplying”
determinants: $T_k$ is the “product” of two factors, each
containing $\binom{n}{k}$ terms, and $T_k$ consists therefore of a
sum of $\binom{n}{k}^2$ determinants.

In order to evaluate $S^2T_k$, we will use the form (9),
where we observe that $\sum_{i<j} P_{ij}^{\sigma}$ commutes with the
antisymmetrization operator used in forming the determinant
(10) from a simple product. Counting the possible
spin interchanges, we find that $S^2T_k$ may be
expressed in $T_{k-1}$, $T_k$, and $T_{k+1}$ with the following
coefficients:

$T_{k-1}$: $k^2\binom{n}{k}^2 \Big/ \binom{n}{k-1}^2 = (n-k+1)^2$,
$T_k$: $-\tfrac{1}{4}N(N-4) + n(n-1) + 2k(n-k) = n(2k+1) - 2k^2$, (13)
$T_{k+1}$: $(n-k)^2\binom{n}{k}^2 \Big/ \binom{n}{k+1}^2 = (k+1)^2$,

which gives the basic formula

$S^2T_k = (n-k+1)^2\,T_{k-1} + [n(2k+1) - 2k^2]\,T_k + (k+1)^2\,T_{k+1}$, (14)

with the definition $T_{-1} = T_{n+1} = 0$ understood. Since the
projection operator (7) is a polynomial in $S^2$, we have
then proved that there exists an expansion of the form

${}^{2l+1}\mathcal{O}\,T_0 = \sum_{k=0}^{n} c_k T_k$. (15)

The coefficients in this expansion may be determined
by using (14) and the relation

$S^2\left(\sum_k c_k T_k\right) = l(l+1)\sum_k c_k T_k$, (16)

which leads to the recurrence formula

$(n-k)^2\,c_{k+1} + [n(2k+1) - 2k^2 - l(l+1)]\,c_k + k^2\,c_{k-1} = 0$. (17)

For the important cases of lowest and highest multiplicity
($l = 0$ and $l = n$), we obtain particularly

${}^{1}\mathcal{O}\,T_0 = c_0 \sum_{k=0}^{n} (-1)^k \binom{n}{k}^{-1} T_k$, (18)

${}^{2n+1}\mathcal{O}\,T_0 = c_0 \sum_{k=0}^{n} T_k$, (19)


and, in general, we have for the first coefficients

$c_1/c_0 = [l(l+1) - n]/n^2$,
$c_2/c_0 = [l^2(l+1)^2 - (4n-2)\,l(l+1) + 2n(n-1)]/n^2(n-1)^2$. (20)

The explicit form of the higher coefficients for an
arbitrary l will be omitted, since they are rather
complicated.

For the applications, the value of $c_0$ is usually
unessential, but, by considering a system with doubly
occupied orbitals, i.e., $a_i = b_i$, the value of $c_0$ for the
singlet operator is easily determined: $c_0 = (n+1)^{-1}$.

We note that (18) is just the Clebsch-Gordan expansion
from which Pratt³ started his investigations,
and we have then shown that our projection operator
for constructing singlets

${}^{1}\mathcal{O} = (1 - S^2/1\cdot 2)(1 - S^2/2\cdot 3)\cdots[1 - S^2/n(n+1)]$, (21)

gives the same expansion as Pratt’s rather complicated
spin-operator; except for a constant factor, the two
operators must therefore be identical.
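The recurrence formula (17) lends itself to a direct numerical check. The following sketch is an illustration only; the function name `projection_coefficients` and the chosen values of n and l are assumptions made for the example. It generates the ratios $c_k/c_0$ from (17), starting from $c_{-1} = 0$, and compares them with the closed forms (18), (19), and (20).

```python
from fractions import Fraction
from math import comb


def projection_coefficients(n, l):
    """Ratios c_k / c_0 (k = 0..n) from the recurrence formula (17)."""
    big_l = l * (l + 1)
    coeffs = [Fraction(1)]   # c_0 / c_0
    previous = Fraction(0)   # c_{-1} = 0
    for k in range(n):
        diagonal = n * (2 * k + 1) - 2 * k * k - big_l
        nxt = -(diagonal * coeffs[k] + k * k * previous) / Fraction((n - k) ** 2)
        previous = coeffs[k]
        coeffs.append(nxt)
    return coeffs


def closing_residual(n, l, coeffs):
    """Left-hand side of (17) for k = n, which must vanish for an allowed l."""
    diagonal = n * (2 * n + 1) - 2 * n * n - l * (l + 1)
    return diagonal * coeffs[n] + n * n * coeffs[n - 1]


n = 4
singlet = projection_coefficients(n, 0)   # lowest multiplicity, Eq. (18)
highest = projection_coefficients(n, n)   # highest multiplicity, Eq. (19)
triplet = projection_coefficients(3, 1)   # used to check Eq. (20)
```

For $l = 0$ the ratios alternate in sign with magnitude $\binom{n}{k}^{-1}$, and for $l = n$ they are all equal to one. The relation (17) for $k = n$ is not used in the construction and serves as a consistency condition on l.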


The Energy Formula for Various Multiplets 


As an example of the applications of the projection
operator formalism, we will calculate the energy of a
spin state of multiplicity $(2l+1)$ having a wave function
obtained by projection of a single determinant

$\Psi = {}^{2l+1}\mathcal{O}\,T_0$, (22)

for a spin-free Hamiltonian of the special form (I, 11)
with $\mathcal{H}_{ij} = e^2/r_{ij}$. Since $\mathcal{O}$ is an Hermitean operator in
our configurational space, we obtain by using (7) and
(15):

$\int \Psi^*\mathcal{H}_{op}\Psi\,(dx) = \int T_0^*(\mathcal{O}^\dagger\mathcal{H}_{op}\mathcal{O})\,T_0\,(dx) = \sum_{k=0}^{n} c_k \int T_0^*\mathcal{H}_{op}T_k\,(dx)$, (23)

and similarly:

$\int \Psi^*\Psi\,(dx) = \sum_{k=0}^{n} c_k \int T_0^*T_k\,(dx)$. (24)

In the simple case when all orbitals $a_1$, $a_2$, $\cdots$ $a_n$,
$b_1$, $b_2$, $\cdots$ $b_n$ involved are strictly orthogonal (as they
would be for singly filled molecular orbitals), there will
be contributions to the energy only for $k = 0$ and $k = 1$
and to the normalization integral only for $k = 0$ according
to (I, 49). By using the first relation (20), we
therefore obtain

$\langle\mathcal{H}_{op}\rangle_{Av} = \int T_0^*\mathcal{H}_{op}T_0\,(dx) + \frac{l(l+1) - n}{n^2} \int T_0^*\mathcal{H}_{op}T_1\,(dx)$

$= \mathcal{H}_{(0)} + \sum_{\mu} (\mu|\mathcal{H}_1|\mu) + \tfrac{1}{2}\sum_{\mu\nu} (\mu\nu|\mathcal{H}_{12}|\mu\nu) - \tfrac{1}{2}\sum_{\mu\nu}^{\parallel\,\mathrm{spins}} (\mu\nu|\mathcal{H}_{12}|\nu\mu) - \frac{l(l+1) - n}{n^2} \sum_{i,j=1}^{n} (a_ib_j|\mathcal{H}_{12}|b_ja_i)$, (25)

where the spin is eliminated in the matrix elements,
and $\mu$ and $\nu$ are to be summed independently over all
orbitals $a_i$, $b_j$.

Since the exchange integrals for Coulomb forces are
always positive as “self-potentials,” formula (25) shows
that, under our specific assumption of orthogonality, the
state with the highest multiplicity will always have the
lowest energy. This extension of Hund’s rule was
recently proved in a still more complete form by Koster⁴
by using Dirac’s vector model.

Spin degeneracy problems have previously been
treated by either Slater’s determinant method⁵ or
Dirac’s vector model.⁶ We note that, even if we have
taken over some elements, as the spin permutation
operators in (9), from Dirac’s theory, our approach is
firmly based on Slater’s determinant idea with the
wave function (22) expressed as a sum of determinants.
However, there is also a connection with the vector
model, and, if we form the mean value of the energy
(25) for all possible distributions of $\alpha$ and $\beta$ electrons
over the orbitals $a_1$, $a_2$, $\cdots$ $a_n$, $b_1$, $b_2$, $\cdots$ $b_n$, we obtain

$\langle\langle\mathcal{H}_{op}\rangle_{Av}\rangle_{Av} = \mathcal{H}_{(0)} + \sum_{\mu} (\mu|\mathcal{H}_1|\mu) + \tfrac{1}{2}\sum_{\mu\nu}{}' (\mu\nu|\mathcal{H}_{12}|\mu\nu) - \tfrac{1}{4}\left[1 + \frac{4l(l+1) - 3N}{N(N-1)}\right] \sum_{\mu\nu}{}' (\mu\nu|\mathcal{H}_{12}|\nu\mu)$, (26)

which is just the average energy given by the vector
model.⁷

Let us now consider all pure spin states which may be
constructed according to formula (22). We note that
there are $\binom{N}{n}$ different ways of distributing the equal
number of $\alpha$ and $\beta$ electrons over the given orbitals,
and each distribution corresponds to a determinant $T_0$,
from which a series of pure spin states may be constructed
by using (22). On the other hand, Bloch⁷ has
shown that the independent number of spin states of
multiplicity $(2l+1)$ is only

$\frac{(2l+1)\,N!}{(n+l+1)!\,(n-l)!} = \binom{N}{n-l} - \binom{N}{n-l-1}$, (27)

and this means that the $\binom{N}{n}$ states given by (22)
cannot be linearly independent. In the general method
of “configurational interaction,” it is important that the
basic functions form a linearly independent orthonormalized
set, and this leads to the problem how to form
such a set from the functions given by (22). For the
singlets, Pratt³ has solved this problem by using the
“branching diagram,” and his formulas may also be
translated into the projection operator formalism. However,
a more general approach may be obtained by
treating all the functions given by (22) on an equal
basis and constructing the independent orthonormalized
set by a slight generalization of the orthonormalization
procedure previously described by the author.⁸

⁴ G. F. Koster, Quarterly Progress Report of Solid-State and
Molecular Theory Group at M.I.T., July 15, 1953 (unpublished),
p. 37.

⁵ J. C. Slater, Phys. Rev. 34, 1293 (1929).

⁶ P. A. M. Dirac, Proc. Roy. Soc. (London) A123, 714 (1929);
Principles of Quantum Mechanics (Oxford University Press,
Oxford, 1935); see also references in E. M. Corson, Perturbation
Methods in the Quantum Mechanics of n-Electron Systems (Blackie,
London, 1951).

⁷ See also F. Bloch, Z. Physik 57, 545 (1929), p. 550; and W.
Heitler, Z. Physik 47, 835 (1928).

We will later see that part of the degeneracy problem
mentioned above will disappear when we start to take
“correlation effects” into account; see Sec. 3. The
orbitals at our disposal will then be naturally divided
into two groups ($a_i$) and ($b_j$), associated with different
spins, since the particles with different spins try to
avoid each other. In this connection, it is also necessary
to generalize formulas (23) and (24) to basic sets having
nonorthogonality integrals essentially different from
zero.
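The counting argument behind (27) can be made concrete with a few lines of arithmetic. The sketch below is illustrative; the function name and the choice $N = 6$ are assumptions for the example. It compares the number $\binom{N}{n}$ of determinants $T_0$ available for projection with the number of linearly independent spin states of each multiplicity, and checks that the independent states of all multiplicities together exhaust the $\binom{N}{n}$ spin distributions with $S_z = 0$.

```python
from math import comb, factorial


def independent_spin_states(n_particles, l):
    """Number of independent spin states of multiplicity 2l+1, as in (27)."""
    n = n_particles // 2
    lower = comb(n_particles, n - l - 1) if n - l - 1 >= 0 else 0
    return comb(n_particles, n - l) - lower


def closed_form(n_particles, l):
    """Equivalent factorial form (2l+1) N! / [(n+l+1)! (n-l)!]."""
    n = n_particles // 2
    return (2 * l + 1) * factorial(n_particles) // (factorial(n + l + 1) * factorial(n - l))


N_PARTICLES = 6                                  # hypothetical example, n = 3
distributions = comb(N_PARTICLES, N_PARTICLES // 2)
counts = [independent_spin_states(N_PARTICLES, l) for l in range(N_PARTICLES // 2 + 1)]
```

For six particles there are 20 distributions but only 5 independent singlets, so the 20 projected singlet functions obtained from (22) must be linearly dependent.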


Applications to Systems Having Cyclic Symmetry 


As another example of the projection operator formalism,
we will consider the problem of the form of the
total wave functions and the corresponding spin-orbitals
in a system having a certain cyclic symmetry of
order m, like a crystal or a benzene ring. Let us assume
that $\Theta$ is the basic symmetry operation, which fulfills
the condition

$\Theta^m = 1$. (28)

This may be considered as a Cayley-Hamilton equation
in a certain matrix representation, and it is then clear
that, in such a representation, the only possible eigenvalues
of $\Theta$ are given by the roots of unity:

$\theta_j = \exp(2\pi i j/m)$, $j = 0, 1, 2, \cdots m-1$. (29)

⁸ See Part II, footnote 11; more details will be given in a forthcoming
paper.

According to (3), the corresponding projection operator
is then given by

$\mathcal{O}_j = m^{-1}(1 - \Theta^m)/(1 - \theta_j^{-1}\Theta) = m^{-1}\sum_{k=0}^{m-1} \theta_j^{-k}\,\Theta^k$. (30)

If $\varphi_0$ is an arbitrary spin-orbital without any particular
symmetry properties, its projection defined by

$\psi_j = \mathcal{O}_j\varphi_0 = m^{-1}\sum_{k=0}^{m-1} \theta_j^{-k}\,\Theta^k\varphi_0 = m^{-1}\sum_{k=0}^{m-1} e^{-2\pi i jk/m}\,\Theta^k\varphi_0$, (31)

has the correct cyclic symmetry and fulfills the relation

$\Theta\psi_j = e^{2\pi i j/m}\,\psi_j$. (32)

The spin-orbital (31) is, of course, nothing but the
function constructed by Bloch⁹ by solving a secular
equation, and (32) is the so-called Bloch condition. The
same arguments may also be applied to the total wave
functions.
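The projection (31) and the Bloch condition (32) can be illustrated on a finite model. In the sketch below, which is an assumption-laden toy and not the paper's construction, a function on a ring of m sites is represented by a list of m amplitudes, the symmetry operation $\Theta$ is taken to be a cyclic shift of the list, and the names `shift` and `cyclic_projection` as well as the sample amplitudes are hypothetical.

```python
import cmath


def shift(vec):
    """Model of the symmetry operation Theta: cyclic shift by one site."""
    return [vec[-1]] + vec[:-1]


def cyclic_projection(vec, index):
    """psi_j = (1/m) sum_k theta_j^(-k) Theta^k phi_0, following (30)-(31)."""
    m = len(vec)
    theta = cmath.exp(2j * cmath.pi * index / m)
    result = [0j] * m
    current = list(vec)                      # Theta^0 phi_0
    for k in range(m):
        weight = theta ** (-k) / m
        result = [r + weight * c for r, c in zip(result, current)]
        current = shift(current)             # Theta^(k+1) phi_0
    return result


# Arbitrary amplitudes on a six-membered ring (illustrative numbers only).
phi_0 = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0]
psi_1 = cyclic_projection(phi_0, 1)
theta_1 = cmath.exp(2j * cmath.pi * 1 / 6)
```

Applying `shift` to `psi_1` multiplies every component by `theta_1`, which is relation (32) in this model, and projecting `psi_1` a second time leaves it unchanged, in agreement with (6).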

We note finally that, in treating wave functions
formed by projection operators, we may use the basic
formula for adjoint matrices:

$\int (\mathcal{O}\Psi_1)^*\,\Psi_2\,(dx) = \int \Psi_1^*\,(\mathcal{O}^\dagger\Psi_2)\,(dx)$, (33)

only if the operator $\mathcal{O}$ is defined in the coordinates of
the configuration space (dx) under consideration. The
simplifications rendered by (33) in combination with
(6) may therefore be obtained only if $\mathcal{O}$ is operating in
the ordinary configurational space, or, if by some formal
arguments, this space may be extended to include also
the variables contained in $\mathcal{O}$.

The method described in this section is of a quite
general character, and it may be used for treating
degeneracies associated with the isotopic spin, the
angular momentum, etc. Further applications will be
given in a following paper.


2. AN EXTENSION OF THE HARTREE-FOCK 
METHOD TO DEGENERATE SYSTEMS 
The importance of the ordinary Hartree-Fock scheme
depends partly on the fact that a single Slater determinant,

$\Psi_0 = (N!)^{-1/2}\det\{\psi_1, \psi_2, \cdots \psi_N\}$, (34)

is the simplest wave function having the correct antisymmetry
property which corresponds to the idea that
N particles are moving independently of each other in
the N spin-orbitals $\psi_1$, $\psi_2$, $\cdots$ $\psi_N$. As shown in Part II,
this scheme has therefore a physical visuality which is
useful in the interpretations and in constructing ionized
and excited states. However, if the system has spin or
orbital degeneracies, there is a difficulty connected with
the fact that the total wave function must be expressed
as a sum of Slater determinants, and part of the visuality
seems then to be lost. We will here show that this
problem may be solved by treating the degeneracy by
the projection operators introduced in the preceding
section.

⁹ F. Bloch, Z. Physik 52, 555 (1928).

Let $\Lambda$ be the operator which is used for classifying the
degenerate states (at least in a first approximation), and
let $\mathcal{O}_l$ be the projection operator (3) for selecting an
eigenfunction belonging to the eigenvalue $\lambda_l$. For the
sake of simplicity, we will further assume that $\Lambda$ is
operating only in the ordinary configuration space,
described by the coordinates $x_1$, $x_2$, $\cdots$ $x_N$. The wave
function

$\Psi = \mathcal{O}\Psi_0$ (35)

is then usually a sum of Slater determinants, but we
note that it is still invariant with respect to unitary
transformations of the two groups of orbitals associated
with the two types of spin. It must therefore be possible
to describe the properties of the system by means of the
fundamental invariant

$\rho(x_1,x_2) = \sum_k \psi_k^*(x_1)\,\psi_k(x_2)$, (36)

defined by (II, 12 and 35) and fulfilling the relations
$\rho^2 = \rho$ and $\mathrm{Tr}(\rho) = N$. Forming the average energy and
using (33), we obtain

$\int \Psi^*\mathcal{H}_{op}\Psi\,(dx) = \int \Psi_0^*(\mathcal{O}^\dagger\mathcal{H}_{op}\mathcal{O})\,\Psi_0\,(dx)$,
$\int \Psi^*\Psi\,(dx) = \int \Psi_0^*(\mathcal{O}^\dagger\mathcal{O})\,\Psi_0\,(dx)$, (37)

i.e., the same expression as for a single determinant and
a “composite” Hamiltonian of the form

$\Omega_{op} = \mathcal{O}^\dagger\mathcal{H}_{op}\mathcal{O}$, (38)

where one has also to take the normalization condition
into proper account. If the operators $\mathcal{H}_{op}$ and $\Lambda$ strictly
commute, the composite Hamiltonian is reduced to the
form $\Omega_{op} = \mathcal{H}_{op}\mathcal{O}$, because of the relation (6).

The physical situation of the degenerate system may 
therefore be described by either the ordinary Hamil- 
tonian 3, and a wave function WV being a sum of deter- 
minants, or a composite Hamiltonian 0,, and a wave 





(op) w= f Vo" Qep¥o(dx) 
p(t’,1) p(1’,2) 


=. + f 2000 A)de+d f On p(2',1)  p(2’,2) 
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function Vo being a single determinant. The formalism 
is parallel to the basic description of time dependence 
in quantum mechanics: in the Schrédinger representa- 
tion, we are considering a fixed Hamiltonian and a time- 
dependent wave function, but, in the Heisenberg repre- 
sentation, we are instead using a time-dependent 
Hamiltonian and a fixed wave function. Both descrip- 
tions are entirely equivalent. 

However, the formalism using the composite Hamil- 
tonian has a certain advantage, since we may directly 
take over the mathematical apparatus of the ordinary 
Hartree-Fock scheme. The projection operators are 
symmetric with respect to the coordinates of the par- 
ticles involved, and, if their explicit form (3) is known, 
the composite Hamiltonian (38) may be expanded in 
the form 

Qop =Q@td 04— Py raat pty Qin: *y (39) 
Dy 8 3! tak 


containing also many-particle operators. However, in 
Parts I and II, we have already extended the theory to 
include such many-particle interactions. 

Under specific assumptions about the spin-orbitals 
in (34) the expansion (39) may sometimes be reduced 
to comparatively simple forms. As an example, we 
will mention that, if all basic orbitals in (34) are 
strictly orthogonal, the combination of the equation 
T,= (S?—n)T> and the first relation in (25) leads to a 
composite Hamiltonian of the form 


Li+1)—n 
Os cee 
n2 


1 
Qop = Hoyt L ior; 


(S?—n) p tig Ii; 
(40) 


Since S? is given by (9), it contains many-particle 
operators up to the order 4; the normalization of ¥ is 
here taken into proper account. 

The investigation in Part II tells us now that it is
possible to extend the ordinary Hartree-Fock method
to operators containing also many-particle terms and
consequently also to include the treatment of degenerate
systems by using the composite Hamiltonian (38). We
will give here a summary only of the most important
results. For a single determinant $\Psi_0$, all higher-order
density matrices may be expressed as determinants of
the first-order density matrix (36), and, according to
(II, 21), we then obtain for the average energy


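$$\langle\mathcal{H}_{\mathrm{op}}\rangle_{\mathrm{Av}} = \int \Psi_0^{*}\,\Omega_{\mathrm{op}}\,\Psi_0\,(dx)
= \Omega_{(0)} + \int \Omega_1\,\rho(1',1)\,dx_1
+ \frac{1}{2!}\int \Omega_{12}
\begin{vmatrix} \rho(1',1) & \rho(1',2) \\ \rho(2',1) & \rho(2',2) \end{vmatrix} dx_1\,dx_2
+ \frac{1}{3!}\int \Omega_{123}
\begin{vmatrix} \rho(1',1) & \rho(1',2) & \rho(1',3) \\ \rho(2',1) & \rho(2',2) & \rho(2',3) \\ \rho(3',1) & \rho(3',2) & \rho(3',3) \end{vmatrix} dx_1\,dx_2\,dx_3 + \cdots, \tag{41}$$

where, after the operations in the integrands have been carried out, we have to put all $x_i' = x_i$. The variation
principle $\delta\langle\mathcal{H}_{\mathrm{op}}\rangle_{\mathrm{Av}} = 0$ leads then to extended Hartree-Fock equations (II, 40-42) of the form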


$$\Omega_{\mathrm{eff}}(1)\,\psi_k(x_1) = \sum_l \psi_l(x_1)\,\lambda(l|k), \tag{42}$$


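where the one-particle operator $\Omega_{\mathrm{eff}}$ is given by

$$\Omega_{\mathrm{eff}}(1) = \Omega_1 + \int \Omega_{12}\,(1 - P_{12})\,\rho(2',2)\,dx_2
+ \frac{1}{2!}\int \Omega_{123}\,(1 - P_{12} - P_{13})
\begin{vmatrix} \rho(2',2) & \rho(2',3) \\ \rho(3',2) & \rho(3',3) \end{vmatrix} dx_2\,dx_3 + \cdots, \tag{43}$$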


and $P_{ij}$ is the ordinary permutation operator for
interchanging the coordinates $x_i$ and $x_j$. Due to the
invariance of (36), we may carry out a unitary transformation
of the two groups of orbitals associated with different
spins in the basic set of spin-orbitals $\psi_k$, which brings
the Hermitean matrix of the Lagrangian multipliers
$\lambda(l|k)$ to diagonal form, and, in this special case, we
obtain the eigenvalue problem


$$\Omega_{\mathrm{eff}}(1)\,\psi_k(x_1) = \epsilon_k\,\psi_k(x_1). \tag{44}$$

In forming (41) and (43), we have assumed that the
normalization integral in the second relation (37) is
equal to 1, but we note that there are no difficulties
in principle in treating also the general case when this
integral has another constant value or is represented
by an expansion of the form (41) for the operator $\mathcal{O}$.
The expectation value of the energy, $E = \langle\mathcal{H}_{\mathrm{op}}\rangle_{\mathrm{Av}}$, is
then given as the quotient between the two quantities
(37), and, in applying the variation principle, one has
to vary also the denominator; the auxiliary condition
may be expressed in the form (II, 58). It is easily shown
that the best spin-orbitals $\psi_k$ are again determined by
an eigenvalue problem of the form (44) but with $\Omega_{\mathrm{eff}}(1)$
replaced by the slightly more complicated one-particle
operator

$$\{\Omega_{\mathrm{eff}}(1) - E\,\mathcal{O}_{\mathrm{eff}}(1)\}\Big/\int \Psi_0^{*}\,\mathcal{O}\,\Psi_0\,(dx),$$

where $\mathcal{O}_{\mathrm{eff}}(1)$ is formed from the expansion of the projection
operator $\mathcal{O}$ in the same way as (43) is formed
from (39). The mathematical details of this more general
case will be further discussed in a forthcoming
paper.

Since the operator $\Omega_{\mathrm{eff}}$ is Hermitean, the eigenfunctions
belonging to different “orbital energies” in (44)
are automatically orthogonal. This simple result
depends on the fact that we are here considering spin-orbitals
without restraining conditions on the two
groups of orbitals involved and that, in this connection,
we have freed ourselves from the idea of “doubly occupied”
orbitals. Our result should also be compared with a
previous discussion by Hartree and Hartree,¹⁰ where, in
investigating the first excited states $^3P$ and $^1P$ of beryllium,
they were forced to keep a nondiagonal element
$\lambda(1s|2s) \neq 0$, since they assumed the 1s-orbital to be
doubly filled. However, Hartree and Hartree suggested
also that it would be both physically and analytically
significant to introduce different 1s-orbitals for the two
spins involved, and this extension of the theory has
here been performed in a quite general way. We will
later see that this distinction is also of great importance
for discussing the correlation effects.

10 D. R. Hartree and W. Hartree, Proc. Roy. Soc. (London)
A154, 588 (1936), particularly p. 594.

Our results show that, even for a degenerate system,
we may keep the idea of the existence of an “effective”
Hamiltonian, but, due to the degeneracy, all terms in
this Hamiltonian may now contain couplings between
several spin-orbitals corresponding to the occurrence of
many-particle forces. In the “independent-particle”
model, a degeneracy may therefore be described by
replacing the ordinary effective Hamiltonian $\mathcal{H}_{\mathrm{eff}}$ by a
composite effective Hamiltonian $\Omega_{\mathrm{eff}}$ containing
“degeneracy couplings” of many-particle character.

The composite effective Hamiltonian $\Omega_{\mathrm{eff}}$ given by
(38), (39), and (43) has not only a formal character
but also an essential physical meaning, which becomes
clear when investigating excited and ionized states. Let
us consider two states associated with the same eigenvalue
of the classifying operator A and therefore having
wave functions $\Psi'$ and $\Psi$ which are obtained from single
determinants by the same projection operator:


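$$\Psi = \mathcal{O}\,\Psi_0, \qquad \Psi' = \mathcal{O}\,\Psi_0', \tag{45}$$

where the determinants are characterized by the
invariants $\rho'$ and $\rho$, respectively. In a first approximation,
we will then assume that, according to (II, 128),
the difference $\Delta\rho = \rho' - \rho$ may be expressed as the sum
of two factorized terms:

$$\Delta\rho(1,2) = \psi_j^{\prime *}(1)\,\psi_j'(2) - \psi_i^{*}(1)\,\psi_i(2). \tag{46}$$

Using (II, 133) and assuming that all basic orbitals
are strictly orthogonal, we then obtain

$$\langle\mathcal{H}_{\mathrm{op}}'\rangle_{\mathrm{Av}} = \langle\mathcal{H}_{\mathrm{op}}\rangle_{\mathrm{Av}}
+ \int \psi_j^{\prime *}(1)\,\Omega_{\mathrm{eff}}(1)\,\psi_j'(1)\,dx_1
- \int \psi_i^{*}(1)\,\Omega_{\mathrm{eff}}(1)\,\psi_i(1)\,dx_1, \tag{47}$$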


showing that the system has eigenstates $\Psi'$ of the total
Hamiltonian $\mathcal{H}_{\mathrm{op}}$ when $\psi_j'$ and $\psi_i$ fulfill the relations

$$\Omega_{\mathrm{eff}}\,\psi_j' = \epsilon_j'\,\psi_j', \qquad \Omega_{\mathrm{eff}}\,\psi_i = \epsilon_i\,\psi_i. \tag{48}$$

For the excitation energy, we get therefore

$$\langle\mathcal{H}_{\mathrm{op}}'\rangle_{\mathrm{Av}} - \langle\mathcal{H}_{\mathrm{op}}\rangle_{\mathrm{Av}} = \epsilon_j' - \epsilon_i, \tag{49}$$

giving a rather visual meaning to the spin-orbital
energies $\epsilon_j'$ and $\epsilon_i$. This theorem was in (II, 135) proved
for single determinants, i.e., for the average energies
of a series of multiplets, but it is here extended to be
valid even for the energies of pure spin states built up
from orthogonal orbitals. This result gives a preliminary
solution of a problem in the molecular theory which
has been discussed rather frequently in the literature.¹¹
We note that, if the basic orbitals are partly overlapping,
the simple result (49) must be somewhat
modified.

Let us now consider the total wave functions $\Psi'$ and
$\Psi$ given by (45). If the determinant $\Psi_0$ for the “ground
state” is built up from the ordinary eigenfunctions of
the operator $\Omega_{\mathrm{eff}}$, then the determinant $\Psi_0'$ for the
“excited state” is obtained from $\Psi_0$ by replacing the
column containing the spin-orbital $\psi_i$ by a column
containing the spin-orbital $\psi_j'$. According to (II, 136), the
spin-orbitals $\psi_i$ and $\psi_j'$ are strictly orthogonal, and it
may then be shown that, for rather general forms of the
projection operator $\mathcal{O}$ (as for ordinary and isotopic
spins), also the total wave functions $\Psi$ and $\Psi'$ fulfill the
necessary orthogonality condition.

The essential problem in treating the excited states
is to construct the corresponding effective operator
$\Omega_{\mathrm{eff}}'$ and to solve the nonlinear problem of finding its
eigenfunctions $\psi_j'$ and eigenvalues $\epsilon_j'$. For a discussion
of this problem, the reader is referred to Part II.

The projection operator formalism also provides a
simple way of calculating transition moments, for, if
$D = e\sum_i \mathbf{r}_i$ is the operator of electric moment, we have

$$\int \Psi^{\prime *}\,D\,\Psi\,(dx) = \int \Psi_0^{\prime *}\,(\mathcal{O}^{\dagger} D\,\mathcal{O})\,\Psi_0\,(dx), \tag{50}$$

which expression may be expanded analogously to (41).

We have here treated the excited states before the
ionized states, since, in the excitations, the total
number of particles is kept constant, which is of importance
for having a fixed form of the projection
operator $\mathcal{O}$ in (45). However, since an ionization may be
considered as the limiting case of an excitation to a
spin-orbital $\psi_j'$ at infinity with $\epsilon_j' = 0$, we obtain from
(49)

$$\langle\mathcal{H}_{\mathrm{op}}'\rangle_{\mathrm{Av}} - \langle\mathcal{H}_{\mathrm{op}}\rangle_{\mathrm{Av}} = -\epsilon_i, \tag{51}$$

showing that $(-\epsilon_i)$ measures the ionization energy.
This is an extension of Koopmans’ theorem¹² to degenerate
systems built up from orthogonal orbitals.

11 R. S. Mulliken, J. Chem. Phys. 46, 497, 675 (1949); C. C. J.
Roothaan, Revs. Modern Phys. 23, 69 (1951), p. 80.

12 See Part II, references 30 and 31.
3. INCLUSION OF CORRELATION EFFECTS


One of the strongest arguments against the validity
of the ordinary Hartree-Fock scheme is that it does not
treat the “correlation” between particles of different
spin types in a proper way, and we will now take up
this problem for discussion.

The basic idea of the “independent-particle model”
is that, in a first approximation, one can neglect the
mutual interaction between the N particles in the
system in constructing the total wave function, which
then takes the simple product form

$$\psi_1(x_1)\,\psi_2(x_2)\cdots\psi_N(x_N), \tag{52}$$

where $\psi_k$ ($k = 1, 2, \cdots N$) is a set of N spin-orbitals
determined essentially by the outer framework. However,
between the particles i and j, there is in reality a
potential $\mathcal{H}_{ij}$ which, particularly for small distances
$r_{ij} \approx 0$, may be tremendously large. If this potential is
repulsive, like the Coulomb potential $\mathcal{H}_{ij} = e^2/r_{ij}$, it tries
naturally to keep the particles apart,¹³ and, since this
“correlation” between the movements of the particles
is entirely neglected in forming (34), the corresponding
energy is affected by an error which is usually called the
“correlation energy.”

The situation is somewhat changed by the antisymmetrization
procedure, which transforms the product
function (52) into a single Slater determinant. In Part I,
we have shown that, for every antisymmetric wave
function, the second-order density matrix (I, 3)
$\Gamma(x_1'x_2'|x_1x_2)$ is also antisymmetric in each set of its
indices, and this implies that, if two indices in a set are
the same ($x_1' = x_2'$ or $x_1 = x_2$), the corresponding element
will vanish identically. For the diagonal element, we
obtain in particular

$$\Gamma(x_1x_2|x_1x_2) = 0, \quad \text{for } x_1 = x_2, \tag{53}$$

showing that the probability density for two particles
with the same spin to be in the same place is zero of at
least the second order (the “Fermi hole”). This means
that the antisymmetry itself acts as if there were a
rather strong repulsion¹⁴ between particles with the
same spin at small distances, and this consequence of
the Pauli principle automatically diminishes the error
due to the neglect of the $\mathcal{H}_{ij}$-correlation. The exchange
energy will therefore take care of a rather large part of
the original correlation energy, referring to particles
with parallel spins. The corresponding effect of the
antisymmetrization on the particle distribution has
been investigated by Lennard-Jones.¹⁵
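The vanishing of the diagonal element in (53) can be checked numerically with a small sketch. The two one-dimensional orbitals below (a Gaussian and x times a Gaussian) and the function names are illustrative assumptions, not taken from the paper, and the determinant is left unnormalized.

```python
import math

def phi_a(x):
    """Illustrative orbital (assumption): a Gaussian, even in x."""
    return math.exp(-0.5 * x * x)

def phi_b(x):
    """Illustrative orbital (assumption): x times a Gaussian, odd in x."""
    return x * math.exp(-0.5 * x * x)

def psi(x1, x2):
    """Unnormalized 2x2 Slater determinant for two particles of equal spin."""
    return phi_a(x1) * phi_b(x2) - phi_b(x1) * phi_a(x2)

def pair_density(x1, x2):
    """Diagonal element of the second-order density matrix, up to a constant."""
    return psi(x1, x2) ** 2

def hole_ratio(x, h):
    """pair_density(x, x + h) / h**2, which stays finite as h -> 0
    when the zero at coincidence is of second order."""
    return pair_density(x, x + h) / (h * h)

if __name__ == "__main__":
    print("density at coincidence:", pair_density(0.3, 0.3))
    for h in (1e-1, 1e-2, 1e-3):
        print(h, hole_ratio(0.3, h))
```

For this particular pair of orbitals the determinant reduces to $(x_2 - x_1)\exp[-(x_1^2 + x_2^2)/2]$, so the ratio printed in the loop should settle to a finite limit, which is the second-order zero referred to in the text.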

13 Compare also some recent results for nucleons by M. Levy
and others.

14 In analogy to (37) and (38), we may interpret the effect of
the antisymmetrization as if we still considered simple product
functions (52) but a “composite” Hamiltonian of the form

$$\Omega_{\mathrm{op}} = \mathcal{H}_{\mathrm{op}}A, \qquad A = (N!)^{-1}\sum_P (-1)^{p}P,$$

where A is the antisymmetrization operator obtained by summing
over all permutations P having the parity p.

15 J. E. Lennard-Jones, J. Chem. Phys. 20, 1024 (1952).
The main problem is apparently to take the correlation
between particles having different spins into
proper account, and a first estimate of this effect was
given by Wigner.¹⁶ His preliminary results seem to be
confirmed by the recent work of Bohm and Pines¹⁷
using the “plasma” model. Here we will instead use
another approach, which is based on our extension of
the ordinary Hartree-Fock method.

The importance of the Hartree-Fock scheme depends
on its connection with the “independent-particle model,”
giving it a physical visuality, which is useful in the
interpretations and in constructing the ionized and
excited states. In the previous section, we have shown
that this visuality is preserved also in our extended
scheme, where the wave function $\Psi$ is a projection of a
single determinant $\Psi_0$:

$$\Psi = \mathcal{O}\,\Psi_0. \tag{55}$$


The basic determinant is here built up from spin-orbitals
obtained from two more or less independent
groups of orbitals by multiplying them by the spin
functions $\alpha$ and $\beta$, respectively. We note that we have
already freed ourselves from the idea of “doubly filled
orbitals,” and that this distinction was of importance
for transforming the Hartree-Fock equations (42) to
the eigenvalue problem form (44). This new degree of
freedom may now also be used for including correlation
effects, since we may choose the two sets of orbitals
associated with different spin functions in such a way
that they let particles with different spins try to avoid
each other.¹⁸ In fact, there is no new basic assumption
needed for including correlation in our extended
Hartree-Fock scheme, since the best spin-orbitals are always
represented by the eigenfunctions of $\Omega_{\mathrm{eff}}$.

As a first example, we may consider the two-electron
problem and its applications to the helium atom, the
hydrogen molecule, and the $\pi$ electrons of ethylene. By
starting from two basic orbitals $u(\mathbf{r})$ and $v(\mathbf{r})$, we may
construct a total wave function $\Psi = \mathcal{O}\{u\alpha, v\beta\}$, which, for
the singlet state, is reduced to the form

$$\Psi = \text{const.}\,\{u(\mathbf{r}_1)v(\mathbf{r}_2) + u(\mathbf{r}_2)v(\mathbf{r}_1)\} \times \{\alpha(1)\beta(2) - \alpha(2)\beta(1)\}. \tag{56}$$


This is one of the exceptional cases, where it is possible
to separate the orbital part and the spin part of the
total wave function into two factors. The orbitals u
and v are neither identical nor orthogonal, and their
best form is automatically found by solving the extended
Hartree-Fock equations (44) for the special
case N = 2.

16 E. P. Wigner, Phys. Rev. 46, 1002 (1934); Trans. Faraday
Soc. 34, 678 (1938).

17 D. Bohm and D. Pines, Phys. Rev. 82, 625 (1951); 85, 338
(1952); 92, 609 (1953); D. Pines, Phys. Rev. 92, 626 (1953);
Proc. 10th Solvay Conference (1953) (to be published).

18 The possibility of having different orbitals for different spins
was first mentioned by Hartree and Hartree in reference 10 in
connection with the diagonalization of the matrix of Lagrangian
multipliers, but it was never used by them. The importance of
this possibility for the proper description of ferromagnetic and
antiferromagnetic materials has several times been pointed out by
J. C. Slater, Phys. Rev. 82, 538 (1951). However, as far as we
know, it has not been explicitly pointed out in the literature that
this new degree of freedom may be used for including correlation
effects in a simple way; compare reference 24.


Correlation Effects in the Helium Atom 


We note that, without further calculations, we may
say a few words about the results we may expect, due
to the connection between our method and previous
investigations treating the two-electron problem from
other points of view. Let us start by considering the
ground state of the helium atom, having an energy
experimentally determined to be 2.9032 atomic units
($e^2/a_0 = 2$ Ry). The symbol $(1s)^2$ indicates an approximation
with $u = v$, and Kellner¹⁹ has shown that, if this
orbital is approximated by a single exponential, the best
result is obtained for an effective nuclear charge
$Z = 1.6875$ giving a total energy of 2.8476. If $u = v$ is
represented by the best Hartree function,²⁰ the energy
value is improved to 2.8615. The symbol $(1s', 1s'')$
would indicate an approximation with $u \neq v$, and
Eckart²¹ has shown that, if u and v are approximated by
two exponentials, the best result will be obtained for
$Z_1 = 1.19$ and $Z_2 = 2.184$, giving a total energy of 2.8756,
i.e., a result considerably better than the Hartree
approximation. Eckart’s simple result is of great interest
to us, since it indicates that we may expect considerable
improvements of the present Hartree-Fock
scheme for the atoms of the periodic system by
freeing ourselves from the idea of doubly filled orbitals.
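The closed-shell and open-shell exponential values quoted above can be reproduced approximately with a short sketch. It assumes normalized 1s Slater functions $u \propto e^{-ar}$, $v \propto e^{-br}$ in the spatial factor of (56) and uses the standard closed-form one- and two-electron integrals for such functions in atomic units; the function name is hypothetical. The function returns the energy with its sign, whereas the text quotes magnitudes.

```python
def energy_two_exponentials(a, b, Z=2.0):
    """Energy in atomic units of the singlet u(1)v(2) + v(1)u(2) built from
    normalized 1s functions with exponents a and b, for nuclear charge Z.
    A sketch under the assumptions stated in the lead-in."""
    s = 8.0 * (a * b) ** 1.5 / (a + b) ** 3            # overlap <u|v>
    h_uu = 0.5 * a * a - Z * a                          # <u|h|u>, kinetic + nuclear
    h_vv = 0.5 * b * b - Z * b                          # <v|h|v>
    h_uv = s * (0.5 * a * b - 0.5 * Z * (a + b))        # <u|h|v>
    direct = a * b / (a + b) * (1.0 + a * b / (a + b) ** 2)   # <uv|1/r12|uv>
    exchange = s * s * 5.0 * (a + b) / 16.0                   # <uv|1/r12|vu>
    return (h_uu + h_vv + 2.0 * s * h_uv + direct + exchange) / (1.0 + s * s)

if __name__ == "__main__":
    # Closed shell (1s)^2 with the single exponent 27/16 = 1.6875 quoted for Kellner
    print(energy_two_exponentials(1.6875, 1.6875))
    # Open shell (1s', 1s'') with the two exponents quoted for Eckart
    print(energy_two_exponentials(2.184, 1.19))
```

With equal exponents the expression collapses to $a^2 - 2Za + 5a/8$, whose minimum lies at $a = Z - 5/16$. The two printed values should come out close to the magnitudes 2.8476 and 2.8756 given in the text, with the open-shell energy the lower of the two.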

Eckart’s result on helium has recently been improved
by Taylor and Parr²² by using a method of configurational
interaction based on a series of determinants of
exponential functions of s-, p-, d-, and f-type. Their
results have been analyzed by Lennard-Jones,²³ who has
shown that the spatial correlation in helium may occur
in two ways: as an “in-out” effect, with one electron
tending to be outside the other, and as an “angular”
effect with the two electrons tending to be on opposite
sides of the nucleus. However, Taylor and Parr pointed
out that, even if they obtain about 97 percent of the
angular correction, their method of ordinary configurational
interaction between s-functions showed a slow
convergence with respect to the radial correlation. This
phenomenon is also reflected by the fact that their
best wave function contained only 64 percent of the
radial correlation energy.

19 G. Kellner, Z. Physik 44, 91 (1927).

20 D. R. Hartree, Proc. Cambridge Phil. Soc. 24, 111 (1928);
for the energy value, see H. Bethe, Handbuch der Physik (Julius
Springer, Berlin, 1933), Vol. 24, No. 1, p. 370. See also W. S.
Wilson, Phys. Rev. 48, 536 (1935).

21 C. Eckart, Phys. Rev. 36, 878 (1930). It seems to be less well
known that this result was already obtained by E. H. Hylleraas
in his pioneer work, Z. Physik 54, 347 (1929). The author is
indebted to Dr. Harrison Shull for some discussions of Hylleraas’
paper.

22 G. R. Taylor and R. G. Parr, Proc. Nat. Acad. Sci. U. S. 38,
154 (1952).

23 J. E. Lennard-Jones, Phil. Mag. 43, 581 (1952).

The helium problem was discussed at the Shelter
Island Conference in 1951, and the situation was
analyzed by Mulliken,²⁴ denoting $(1s)^2$ and $(1s', 1s'')$
as closed-shell and open-shell electron configurations.
It is possible that the extended Hartree-Fock scheme
proposed here is the “generalization of the SCF procedure”
required by Mulliken. In any event, it will be
interesting to see how good an approximation of the
energy one can obtain by solving the extended Hartree-Fock
equations (44) for $N = 2$ and $u \neq v$ with the normalization
integral taken into account, and how closely one
can approach Hylleraas’ classical result in the nonrelativistic
approximation. In a first approximation, we
will assume that u and v are s-functions depending
only on the distance to the nucleus, which will describe
the main part of the “in-out” effect. In the
next step, u and v may depend on the angles, but,
in such case, we must introduce a projection operator
containing also the total angular momentum operator
in order to select an S state for the total wave function.
Numerical calculations on the helium problem along
these lines are now in progress.


Correlation Effects for Diatomic Molecules 


The wave function (56) may also be applied to the
hydrogen molecule problem. It would probably be very
hard to solve the exact extended Hartree-Fock equations
(44) for such a problem, but significant results
could be obtained by using the variation principle. In
such a case, it is therefore important to know something
a priori about the general form of the orbitals involved,
and, according to the previously mentioned “correlation
principle,” we will assume that u and v may be of such
a type that the electrons in them (having different
spins) tend to avoid each other. This is mainly established
by two effects: the “alternant” effect, trying to
keep the two electrons on separate atoms, and the
“in-out” effect, trying to keep one electron outside the
other, when they happen to be on the same atom. The
form of these orbitals is indicated in Fig. 1. Semilocalized
molecular orbitals were first constructed for the
hydrogen molecule by Coulson and Fischer,²⁵ who pointed
out that they lead to the same result as the ordinary
method of configurational interaction using two
determinants.

As far as we know, the “in-out” effect has not previously
been used in the theory of molecules or crystals.
It seems to be rather important, since the Coulomb
integrals associated with two electrons concentrated in
the same orbital on one atom are certainly too large,
because of the neglect of the electronic correlation. The
problem of these “ionic” Coulomb integrals has been
particularly emphasized by Moffitt²⁶ in treating the
oxygen molecule and by Pariser and Parr²⁷ in investigating
some conjugated organic compounds, and these
authors proposed that the values of the ionic integrals
should be corrected by comparison with experimental
data. Sponer²⁸ and the present author found similarly
that, in a $\pi$-electron theory of ethylene based on the
atomic Hartree-Fock functions for carbon, the singlet-triplet
separation came out much too large, and that the
error could be localized mainly to the ionic $(\pi\pi|\pi\pi)$-integral.²⁹
By introducing the “in-out” effect for the
electrons condensed on the same atom, the value of the
ionic Coulomb integrals will now be essentially diminished,
and we note that this correction may be carried
out in a purely theoretical way by using the variational
principle for the total energy. Numerical applications
to the hydrogen molecule and to ethylene are now being
prepared.

FIG. 1. The two orbitals u and v showing “alternant” and “in-out”
effect for a two-electron problem in a diatomic molecule.

24 R. S. Mulliken, Proc. Nat. Acad. Sci. U. S. 38, 160 (1952).

25 C. A. Coulson and I. Fischer, Phil. Mag. 40, 386 (1949). The
“best orbital” problem for the hydrogen molecule was formulated
in a complete form by M. Kotani, Proceedings of the Shelter
Island Conference on Quantum Mechanical Methods in Valence
Theory, 139 (1951), where he also discusses the solution in
elliptical coordinates. See also J. Lennard-Jones and J. A. Pople,
Proc. Roy. Soc. (London) A210, 190 (1951).


4. THE METHOD OF ALTERNANT MOLECULAR 
ORBITALS 

In a theory of molecules and crystals, where the
total wave function is approximated by a single determinant
constructed from molecular spin-orbitals, there
is a certain difficulty connected with the fact that the
cohesive energy shows a wrong asymptotic behavior
for separated atoms.³⁰ This depends on the fact that
such a wave function permits electrons of different
spins to accumulate on the same atom and give rise to
negative and positive ions, having higher energy
together than the ordinary dissociation products; see
Fig. 2. One way of removing this defect is by configurational
interaction, but, except for the simplest
molecules, this approach leads usually to secular equations
of such a high order that they are extremely hard
to solve.

FIG. 2. The energy E as a function of interatomic distance R.
The upper curve refers to a single determinant, and the lower curve
represents the correct behavior.

26 W. Moffitt, Proc. Roy. Soc. (London) A210, 224, 245 (1951).

27 R. Pariser, J. Chem. Phys. 21, 568 L (1953); R. Pariser and
R. G. Parr, J. Chem. Phys. 21, 767 (1953).

28 H. Sponer and P. O. Löwdin, J. phys. radium 15, 607
(1954).

29 See also the remarks by R. S. Mulliken and P. O. Löwdin at
the Nikko symposium 1953, Proceedings of the Japanese Conference
on Theoretical Physics, 1953 (to be published).

30 See, e.g., J. H. Van Vleck and A. Sherman, Revs. Modern
Phys. 7, 167 (1935), p. 170; and J. C. Slater, Quarterly Progress
Report of Solid-State and Molecular Theory Group, M.I.T., p. 26,
January 15, 1952 (unpublished).

Fortunately, there seems to be also another possibility
for solving this problem. Slater³¹ has several times
pointed out that, in an antiferromagnetic material,
there must be a tendency for a certain spin alignment
with the crystal divided into two sublattices with the
valence electrons having either plus or minus spin.
Slater is accordingly looking for a crystal theory which
would be similar to a valence bond method for separated
atoms and similar to a molecular orbital method for
small and intermediate distances between the atoms.
The essential point is apparently to find a modification
of the ordinary molecular orbital method which, for
separated atoms, automatically would lead to a spin
alignment of the type proposed by Slater, for then there
would be no possibility for excessive occurrence of ions.
However, in order to avoid a real antiferromagnetic
behavior of the system, the total wave function must
be invariant with respect to an interchange of the two
spins.

An attempt to translate these ideas into mathematical 
form has been made by the author³² by using the
method of “alternant molecular orbitals,” and we will 
here give a short survey of its main result in order to 
discuss its connection with the general theory developed 
in this paper. 

Let us say that, by solving the Hartree-Fock equations
by, e.g., the MO-LCAO method, we have found a
set of MO’s for the valence electrons belonging to the
system. In the naive MO theory, the orbitals for the
valence electrons are only partly filled. By using all the
MO’s available, we may now try to construct combinations
which tend to be localized on two interpenetrating
subsystems, I and II, for separated atoms. For the sake
of simplicity, let us consider an alternant system, for
instance a crystal constituted of two equivalent sublattices
I and II, such as the body-centered cubic structure,
or an alternant hydrocarbon,³³ where the atoms, if one
moves along a chain of unsaturated carbon atoms,
belong alternately to set I and to set II.

31 J. C. Slater, Phys. Rev. 35, 509 (1930), see p. 527; Proceedings
of the Shelter Island Conference on Quantum Mechanical Methods
in Valence Theory, 121 (1951); Phys. Rev. 82, 538 (1951).

32 P. O. Löwdin, Proceedings of the Japanese Conference on
Theoretical Physics, 1953, Nikko Symposium (to be published).
It should be noted that our method is applicable to both mobile
and localized electrons. Systems containing only localized single
bonds have also been treated by L. A. Schmid, Phys. Rev. 92,
1373 (1953) and Hurley, Lennard-Jones, and Pople, Proc. Roy.
Soc. (London) A220, 446 (1953).

As in Part II, Sec. 3 (b), we let $\phi_\mu$ be the ordinary or
hybridized atomic orbitals associated with the system,
and $\varphi_\mu$ the corresponding set of ON-AO’s. It is a
characteristic feature of the alternant systems that the
MO’s occur in pairs, $j'$ and $j''$, with orbital energies $\epsilon_{j'}$
and $\epsilon_{j''}$ belonging to symmetric places in the lower and
the upper half of the “energy band,” respectively. The
excited orbital $\psi_{j''}$ is obtained from the lower orbital $\psi_{j'}$
by changing the sign of the coefficients of the AO’s of
one of the subsystems, let us say II:

$$\psi_{j'} = \sum_\mu^{\mathrm{I}} \varphi_\mu c_{\mu j} + \sum_\mu^{\mathrm{II}} \varphi_\mu c_{\mu j}, \qquad
\psi_{j''} = \sum_\mu^{\mathrm{I}} \varphi_\mu c_{\mu j} - \sum_\mu^{\mathrm{II}} \varphi_\mu c_{\mu j}. \tag{57}$$

Let us then form the combinations


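$$\begin{aligned}
\psi_{j\mathrm{I}} &= a\psi_{j'} + b\psi_{j''} = (a+b)\sum_\mu^{\mathrm{I}} \varphi_\mu c_{\mu j} + (a-b)\sum_\mu^{\mathrm{II}} \varphi_\mu c_{\mu j}, \\
\psi_{j\mathrm{II}} &= a\psi_{j'} - b\psi_{j''} = (a-b)\sum_\mu^{\mathrm{I}} \varphi_\mu c_{\mu j} + (a+b)\sum_\mu^{\mathrm{II}} \varphi_\mu c_{\mu j}.
\end{aligned} \tag{58}$$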

Since the normalization condition takes the form
$a^2 + b^2 = 1$, we may put $a = \cos\theta$ and $b = \sin\theta$ and describe
the mixing between the MO’s by an angle $\theta$. We note
the special cases:

$\theta = 0°$, ordinary lower half MO’s;
$\theta = 45°$, purely alternant MO’s;
$\theta = 90°$, ordinary upper half MO’s.

The MO’s belonging to the lower half of the “energy
band” are bonding orbitals, whereas the MO’s belonging
to the upper half are antibonding.³³ For $0 < \theta < 90°$, the
orbitals $\psi_{j\mathrm{I}}$ are semilocalized on system I and the
orbitals $\psi_{j\mathrm{II}}$ on system II, and we will therefore call
them alternant molecular orbitals. For $\theta = 45°$, this
localization is complete. We note further that orbitals
belonging to different indices j are still orthogonal,
whereas

$$\lambda = \int \psi_{j\mathrm{I}}^{*}\,\psi_{j\mathrm{II}}\,dx = \cos 2\theta. \tag{59}$$


As a simple example, we may consider the lowest or- 
bitals for a linear chain; see Fig. 3. 
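The construction (58) and the overlap (59) can be illustrated for the smallest alternant system, two orthonormalized atomic orbitals with one atom in each subsystem. This is a minimal sketch; the two-site example and the function names are assumptions made for illustration only.

```python
import math

def alternant_pair(theta_deg):
    """Coefficients of psi_I and psi_II on two orthonormal AOs (atom I, atom II)
    for the mixing angle theta, with a = cos(theta) and b = sin(theta)."""
    theta = math.radians(theta_deg)
    a, b = math.cos(theta), math.sin(theta)
    r = 1.0 / math.sqrt(2.0)
    bonding = (r, r)         # lower-half MO
    antibonding = (r, -r)    # upper-half MO: sign changed on subsystem II
    psi_I = tuple(a * p + b * q for p, q in zip(bonding, antibonding))
    psi_II = tuple(a * p - b * q for p, q in zip(bonding, antibonding))
    return psi_I, psi_II

def overlap(u, v):
    """Overlap integral of two orbitals expanded in an orthonormal basis."""
    return sum(x * y for x, y in zip(u, v))

if __name__ == "__main__":
    for theta in (0.0, 23.0, 45.0, 90.0):
        psi_I, psi_II = alternant_pair(theta)
        print(theta, psi_I, psi_II, overlap(psi_I, psi_II))
```

The coefficients are $(a+b)/\sqrt{2}$ and $(a-b)/\sqrt{2}$, as in (58), so the overlap is $a^2 - b^2 = \cos 2\theta$; at $\theta = 45°$ each orbital sits on a single atom and the two become orthogonal.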

33 C. A. Coulson and H. C. Longuet-Higgins, Proc. Roy. Soc.
(London) A192, 16 (1948).

In order to construct a wave function for the system,
which leads to the correct asymptotic behavior of the
energy for separated atoms, we will now consider a
Slater determinant

$$T_0 = \{\psi_{1\mathrm{I}}\alpha, \ldots, \psi_{n\mathrm{I}}\alpha;\ \psi_{1\mathrm{II}}\beta, \ldots, \psi_{n\mathrm{II}}\beta\}, \tag{60}$$


where we have used the notation (11); each orbital of
type I is therefore occupied by an electron with plus
spin and each orbital of type II by an electron with
minus spin. The various spin multiplets may then be
obtained by using the projection operator (7):

$$\Psi = {}^{2l+1}\mathcal{O}\,T_0. \tag{61}$$

If, for separated atoms, we let $\theta$ tend to the value $\theta = 45°$,
there will be a spin alignment of the type proposed by
Slater, and the wave function (60) will then have the
correct asymptotic behavior. We will now check that
the various spin multiplets of the wave function (60)
also have preserved this property. For $\theta = 45°$, the discussion
is simplified by the fact that all the alternant
MO’s become strictly orthogonal, see (59), and the
energy is then given by formula (25):


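$$\langle\mathcal{H}_{\mathrm{op}}\rangle_{\mathrm{Av}} = \int T_0^{*}\,\mathcal{H}_{\mathrm{op}}\,T_0\,(dx)
- \frac{l(l+1)-n}{n^{2}} \sum_{j,k=1}^{n} (j\mathrm{I}, k\mathrm{II}\,|\,\mathcal{H}_{12}\,|\,k\mathrm{II}, j\mathrm{I}). \tag{62}$$

Since the exchange integrals in the last term in the
right-hand member tend to zero for $\theta = 45°$ and separated
atoms, our theorem is proved.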

For $\theta = 0$ and $l = 0$, the function (61) reduces to the
well-known single-determinant wave function of the
ordinary MO theory with the bonding orbitals doubly
occupied. This means that, by varying $\theta$, we may
obtain a depression of the energy curve in Fig. 2 also
for intermediate distances, and, in particular cases, the
improvement of the energy minimum may be appreciable.
The physical interpretation of this procedure
will be discussed in the next section. The energy expression
in the general case ($\theta \neq 45°$) is somewhat more
involved than (62), due to the occurrence of the
nonorthogonality integral $\lambda = \cos 2\theta$ in (59), but it may be
derived by using (23), (24), and (I, 49). The mathematical
details will be deferred to a following paper.
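The effect of varying $\theta$ on the energy curve can be illustrated with a toy calculation. The sketch below assumes a two-site, two-electron model with a hopping integral t between the two orthonormalized AOs and a repulsion U for two electrons on the same atom; this model, its parameters, and the function names are assumptions introduced here for illustration and do not appear in the paper. In this model the singlet built from the alternant pair is proportional to (covalent) + $\lambda$ (ionic) with $\lambda = \cos 2\theta$, which gives the energy expression used in `amo_energy`.

```python
import math

def amo_energy(theta_deg, t, U):
    """Singlet energy of the alternant-orbital function in the assumed two-site
    model: E = (U*lam**2 - 4*t*lam) / (1 + lam**2), with lam = cos(2*theta)."""
    lam = math.cos(2.0 * math.radians(theta_deg))
    return (U * lam * lam - 4.0 * t * lam) / (1.0 + lam * lam)

def exact_singlet_energy(t, U):
    """Lowest singlet eigenvalue of the same two-site model."""
    return 0.5 * U - math.sqrt(0.25 * U * U + 4.0 * t * t)

def best_theta(t, U, steps=9000):
    """Grid search for the mixing angle between 0 and 45 degrees."""
    grid = [45.0 * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda th: amo_energy(th, t, U))

if __name__ == "__main__":
    U = 4.0
    for t in (2.0, 1.0, 0.5, 0.0):   # decreasing t mimics increasing distance R
        th = best_theta(t, U)
        print(t, amo_energy(0.0, t, U), th, amo_energy(th, t, U),
              exact_singlet_energy(t, U))
```

At $\theta = 0$ the expression reduces to the single-determinant value $U/2 - 2t$, which tends to $U/2$ rather than zero as t vanishes, the wrong asymptotic behavior of the upper curve in Fig. 2. Letting $\theta$ approach 45° removes the ionic part, and for two electrons the optimized value coincides with the exact singlet energy of the model, in line with the Coulson-Fischer equivalence mentioned in Sec. 3.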

Here we would only like to mention that, in an application
of the alternant MO method to the ground state
of the benzene molecule, Itoh and Yoshizumi³⁴ have
obtained a depression of the energy minimum for $\theta \approx 23°$
of about 2.35 ev, which amounts to 85 percent of the
depression obtained by Parr, Craig, and Ross³⁵ by
using a method of configurational interaction containing
nine determinants; an expansion of the wave function
(61) shows that there is also a fairly good resemblance
between the two total wave functions found in such
different ways.

FIG. 3. Ordinary and alternant MO’s.

34 T. Itoh and H. Yoshizumi, J. Phys. Soc. Japan (to be published).
The author is greatly indebted to them for kindly informing
him about their results before publication.

35 Parr, Craig, and Ross, J. Chem. Phys. 18, 1561 (1950).


5. GENERAL THEORY OF SPIN ALIGNMENT 
IN MOLECULES AND CRYSTALS 

The principal investigation of “correlation effects,” 
carried out in this paper, makes it now possible for us 
to build up the method of alternant MO’s on a more 
general basis. It is clear that the erroneous asymptotic 
behavior of the upper energy curve in Fig. 2 depends 
mainly on the neglect of the Coulomb correlation 
between electrons with different spins, since it is just 
the Coulomb repulsion which prevents the excessive 
formation of negative ions with the electrons too closely 
condensed on the same atom. The solution provided by 
the method of alternant MO’s must therefore in some 
way take this correlation into account, and, if we 
compare the form of the orbitals in Figs. 1 and 3, we 
find a striking similarity, and we see that the alternant 
MO’s may be considered as being constructed according 
to the “correlation principle” so that electrons having 
different spins tend to avoid each other due to their 
mutual repulsion. 

However, once we have put the method of alternant 
MO’s in connection with our general theory, we see 
immediately that it may be improved in several ways, 
for instance by including the “in-out” effect for elec- 
trons which happen to be on the same atom. This can 
be performed by constructing the alternant MO’s from 
two sets of atomic orbitals corresponding to different 
effective nuclear charges,’ and the inclusion of this 
effect is certainly necessary in order to obtain good 
results in investigations of the ground state and lower 
excited states of, e.g., the conjugated systems.?’ 

The theory of alternant MO’s was originally built up
on Slater’s idea of the existence of a certain spin
alignment in alternant systems, but we can now free
ourselves from this assumption as being unnecessary. The
ordinary MO theory is firmly built on the idea of
“doubly occupied orbitals,” but, as soon as we have
freed ourselves from this restriction, there are two groups
of orbitals associated with different spins at our
disposal, and the “correlation” between them is
automatically determined by the extended Hartree-Fock
equations (44) and the composite effective Hamiltonian
Ω_eff given by (38), (39), and (43), where, of course,
we now have to take the total normalization integral
given by (37) into full account. However, even if the
form of the two groups of orbitals is fully determined by
the eigenvalue problem (44), it is certainly extremely
hard to solve this problem exactly for a molecule or
crystal, and some idea of the qualitative form of these
orbitals will then be very useful in connection with the
approximate solution of (44) by means of the variation
principle.

This result is of importance, since it tells us that the 
extended Hartree-Fock equations (44) will give the 
answer to the question of the existence of a general spin 
arrangement in molecules and crystals. It gives a pos- 
sibility for investigating the difference between the spin 
alignments in body-centered and face-centered cubic 
structures, and many other problems. 


6. CONCLUSIONS 


By using the idea of projection operators and wave 
functions being projections of single determinants, we 
have here given an extension of the ordinary Hartree- 
Fock scheme to include the treatment of degeneracies 
and correlation effects. From the very beginning, we
have freed ourselves from the idea of “doubly occupied”
orbitals, and the two more or less independent groups
of orbitals associated with different spins are then de- 
termined by extended Hartree-Fock equations, where 
the effective Hamiltonian is derived from a composite 
total Hamiltonian containing also the projection op- 
erator for the state under consideration. This effective 
Hamiltonian contains also many-particle interactions, 
but, otherwise, the extended scheme has preserved the
simplicity and physical visuality characteristic of the
theory based on a single Slater determinant. The two
groups of orbitals are of such a type that particles having 
different spins tend to avoid each other, and the ex- 
tended Hartree-Fock equations give therefore also a de- 
scription of spin alignments in molecules and crystals. 

In order to apply the theory developed in principle 
in this paper to practical problems, it is desirable to 
know also the reduced form of the basic energy (41) 
expressed only in terms of the two groups of basic 
orbitals, and the mathematical details of this problem
will be treated in a forthcoming paper. Here we will 
only point out that, even if we have illustrated our 
extension of the “independent-particle model” by 
examples from the theory of electronic structure of 
atoms, molecules, and crystals, the general scheme may 
just as well be applied to nuclear theory after inclusion 
of the isotopic spin. 
New analytical, x-ray diffraction, precision density, and electrical resistivity studies of boron-doped
silicon show that boron resides in the silicon lattice substitutionally. 





INTRODUCTION 


Pearson and Bardeen¹ and more recently Morin
and Maita² have investigated the electrical
properties of silicon containing boron. Pearson and 
Bardeen concluded from electrical and lattice parameter 
studies that boron occupies substitutional positions in 
the silicon lattice. In the present investigation density 
measurements in addition to lattice parameters, 
electrical, and chemical analysis, confirm the location 
of boron as substitutional in silicon up to concentrations 
of 0.3 atom percent. 


PREPARATION OF SAMPLES 


There is no ready source of boron of a purity com- 
parable to that of silicon currently available. Most 
elements have a smaller segregation coefficient (C_s/C_l)
than boron in silicon.³,⁴ It is possible to make use of this
difference to remove impurities introduced with the
boron in silicon itself. Thus, if boron with impurities
that segregate during crystallization of silicon is added
to silicon which is then zone-melted,⁵ the first frozen
region of the zone-refined ingot will still contain boron 
but, in accordance with their segregation constants, 
the impurities will be swept out to crystallize toward 
the sprout end. 

For these studies, approximately one atom percent 
of Norton “spec. fine” boron was added to zone- 
refined DuPont hyperpure silicon. This solution was 
zone-refined, five zones being used. Weighed portions of 
about the second 40 percent of the zone-refined ingot 
were added to known weights of zone-refined silicon 
and single crystals grown. Slices of single crystal were 
taken from as nearly the same region of each crystal 
as possible in order to minimize the effects of segre- 
gation of boron down the length of a crystal. Where 
chemical analyses for boron were made, the pieces 
were taken adjacent to the slices retained for study. 
The samples prepared and their boron content are 
listed in Columns I and II of Table I.⁶


¹ G. L. Pearson and J. Bardeen, Phys. Rev. 75, 865 (1949).

² F. J. Morin and J. P. Maita, Phys. Rev. 96, 28 (1954).

³ R. N. Hall, J. Phys. Chem. 57, 836 (1953). Note: Gallium
should be 0.004.

⁴ E. A. Taft and F. H. Horn, Phys. Rev. 93, 64 (1954).

⁵ W. G. Pfann, J. Metals 80, 747 (1952).

⁶ I am indebted to W. W. Welbon of the Analytical Section,
Chemistry Department for these boron analyses. The method
used was worked out by Dr. E. H. Winslow of this laboratory.


PRECISION DENSITY MEASUREMENTS 


In order to establish the location of boron in a 
silicon crystal the x-ray diffraction data require con- 
firmation by density measurements. Since only very 
small concentrations of boron are necessary to alter the 
electrical properties of silicon greatly, it was considered 
desirable to use a method that would measure ex- 
tremely small density changes with high precision. An 
adaptation of a “density gradient” method developed 
by Linderstrom-Lang⁷ was tested and used in these
studies. Details for the preparation, use, and accuracy
of the method are discussed in the Appendix. The
methods used in this study were accurate to 10⁻⁵ g/cc
while the sensitivity was about 3×10⁻⁷ g/cc.


DENSITY OF SILICON 


As a check on the operation of the density columns, 
the density of a 40 ohm-cm piece of silicon (DuPont, 
zone refined and grown single crystal) was determined. 
The results, taken from measurements that varied 
2X 10~ over 48 hours are given in Table II. 

Other previous calculations or measurements have 
been included. Since the time of these measurements 
we have found a variation in the density of
high-resistivity single-crystal silicon of the order of 20×10⁻⁶
g/cc. We have thus rounded the value reported to
2.3306₇ (25.3°C). This value compares favorably with
the density 2.3305 calculated from the x-ray diffraction
a₀ value determined on the same material.
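
The comparison between the measured density and the density calculated from the x-ray lattice parameter can be checked with a short sketch. It assumes the standard diamond-cubic cell with 8 atoms (not stated in the text), reads the Avogadro number in the Table II footnote as 6.02402×10²³ (chemical scale), and uses the atomic weight 28.09 quoted there. The function name `xray_density` is illustrative only.

```python
# Sketch: x-ray density of silicon from its cubic lattice parameter.
# Assumption: diamond-cubic structure with 8 atoms per unit cell.

AVOGADRO_CHEM = 6.02402e23   # per mole, chemical scale (reading of the Table II footnote)
M_SI = 28.09                 # atomic weight of silicon quoted in the paper
ATOMS_PER_CELL = 8           # assumed: diamond-cubic unit cell


def xray_density(a_angstrom, atomic_weight=M_SI, n_avogadro=AVOGADRO_CHEM):
    """Return the density in g/cc for a cubic cell of edge a (angstroms)."""
    cell_volume_cc = (a_angstrom * 1e-8) ** 3          # 1 angstrom = 1e-8 cm
    cell_mass_g = ATOMS_PER_CELL * atomic_weight / n_avogadro
    return cell_mass_g / cell_volume_cc


if __name__ == "__main__":
    rho = xray_density(5.4295)
    print(f"x-ray density for a = 5.4295 A: {rho:.4f} g/cc")
```

With a = 5.4295 Å the result falls within a few parts in 10⁵ of the 2.3305 g/cc quoted in the text; the small residual depends on the constants assumed.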


TABLE I. Increase in density and lattice
contraction from boron in silicon.

Crystal    Boron atom        g/cc ×10⁶          a in Å
           percent added     greater than Si

RR-55      0.314±16          1540ᵇ              5.4249±0.0010
RR-54      0.104±17.5ᵃ       315                5.4281±0.0008
RR-53      0.052             189                5.4282±0.0004
RR-52      0.021                                5.4291±0.0004
RR-51      0.0021
RR-50      0.00021
RR-49      0.000021
Control                                         5.4295±0.0005

ᵃ By chemical analysis; others by aliquot taken.
ᵇ Measured pycnometrically.

⁷ K. Linderstrom-Lang and H. Lang, Jr., Compt. rend. trav.
lab. Carlsberg 21, 315 (1938); Linderstrom-Lang, Jacobsen, and
Johannsen, Compt. rend. trav. lab. Carlsberg 23, 17-24 (1941).

TABLE II. Lattice parameter and density of silicon.








Worker Year Purity ainA d g/cc 





Lipson and 
Rogers* 

Straumanis 
and Aka® 


—— 


Ao 
This study 


2.328% 
2.3283(20°C) 


1944 
1952 


99.85% 5.4308 


0.01 ohm-cm_ 5.431 
+0.03% 


1951 
1953 
1954 


ca40ohm-cm 5.4295 +0.0005 


ca 40 ohm-cm 


2.331 
2.3305 (25°C) 
2.33067 








ᵃ H. Lipson and L. E. R. Rogers, Phil. Mag. 35, 544 (1944).

ᵇ Calculated by using N_Av(chem) = 6.02402×10²³, λ_g/λ_s = 1.00203, from
J. A. Bearden and H. M. Watts, Phys. Rev. 81, 73, 160 (1951); and M_Si
= 28.09, Chem. Eng. News (1951).

ᶜ M. E. Straumanis and E. Z. Aka, J. Appl. Phys. 23, 330 (1952).

ᵈ McSkimmin, Bond, Buehler, and Teal, Phys. Rev. 83, 1080 (1951).

ᵉ I am indebted to Mrs. B. Decker of this laboratory for these data,
the average of seven measurements.


DENSITY OF BORON-DOPED SILICON 


Spheres of boron-doped silicon were placed in a 
prepared density-gradient column together with the 
glass calibration spheres. After allowing a day for 
equilibration, the position of the spheres was read and 
by interpolation from the position of the calibration 
spheres translated into differences in density referred to 
40 ohm-cm resistivity silicon. These values are reported 
in Table I, Column III. 

It will be noted that the density increases with 
increasing boron concentration in the silicon. This was 
contrary to an expectation based on the assumption 
that the density change is the additive effect of the 
increase in the density caused by lattice contraction
and the decrease in density caused by substituting the
lighter boron atoms for the silicon. From Pearson and
Bardeen’s¹ data, the lattice contraction amounts to
0.115 percent per atom percent boron, or an increase 
of 0.345 percent in density. The density decrease due 
to a substitution of one atom percent boron for silicon 
is readily calculated to be 0.64 percent. From these 
considerations we had anticipated a decrease in density 
of the order 0.3 percent from boron doping. The 
observed increase in density amounted to about 0.2 
percent. This increase could be explained if the boron 
atoms do not enter the silicon crystal entirely by 
substitution—viz., as a precipitate or interstitially. 


Fig. 1. Lattice constant for silicon as a function
of boron concentration. (Axes: lattice constant vs atom % boron in silicon.)
It seemed more reasonable, particularly because the
electrical data of Pearson and Bardeen were in such
good agreement with their assumption of substitutional 
boron, that the explanation might rather be found in 
some uncertainty of the lattice contraction data given 
by these authors. This seemed the more likely since 
Pearson and Bardeen were obliged to use polycrystalline 
silicon. Their composition scale could thus be in error 
due to boron rejection at grain boundaries. Such 
considerations led us to a redetermination of the lattice 
contraction due to boron. 

The lattice parameters for the boron-doped silicon
samples⁸ are reported in Table I, Column IV. These
values are compared with the results of Pearson and
Bardeen in Fig. 1. It is observed that the lattice
contraction determined in this study is considerably
greater than that formerly reported. The contraction
amounts to 0.86 percent in density per atom percent
boron in silicon as compared with 0.35 percent
reported by Pearson and Bardeen.

Fig. 2. Measured increase in density of silicon as a function of
boron concentration compared with sum of density changes due
to lattice contraction and boron substitution.

⁸ I am indebted to Mrs. A. Cooper of the Metallurgy Department
for these x-ray diffraction determinations.

Fig. 3. Total mobility as a function of resistivity for boron-doped silicon at room temperature.

The effect on the calculated density from our lattice
contraction data is shown in Fig. 2. The lower curve is
for the decrease in density calculated for the effect of
boron substitution only. The sum of this curve and the
density increase due to lattice contraction has also been
indicated. It is seen that within the accuracy of the
determination of the boron composition the measured
densities agree well with the calculated net density
change based on the assumption that this change is
the sum of the change due to lattice contraction and
the change due to substituting light boron atoms for
silicon. This may be taken as reassurance that boron
does, in fact, enter the silicon lattice substitutionally.
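
The additive argument can be followed numerically with a minimal sketch. It uses only figures quoted in the text, all per atom percent boron: the Pearson and Bardeen linear contraction of 0.115 percent, the 0.86 percent density increase from the present contraction data, and the 0.64 percent decrease from substituting boron for silicon. The function names are illustrative, and the factor of three assumes small, isotropic changes in a cubic cell.

```python
# Sketch: net density change per atom percent boron, from figures in the text.

SUBSTITUTION_LOSS_PCT = 0.64        # density decrease from lighter boron atoms
PEARSON_BARDEEN_LINEAR_PCT = 0.115  # linear lattice contraction (earlier data)
THIS_STUDY_DENSITY_GAIN_PCT = 0.86  # density increase from contraction (this study)


def density_gain_from_contraction(linear_contraction_pct):
    """Percent density increase for a small linear contraction.

    Volume scales as a**3, so to first order the density rises by
    three times the linear contraction (small-change assumption).
    """
    return 3.0 * linear_contraction_pct


def net_change(density_gain_pct, substitution_loss_pct=SUBSTITUTION_LOSS_PCT):
    """Sum of the contraction gain and the substitution loss, in percent."""
    return density_gain_pct - substitution_loss_pct


if __name__ == "__main__":
    old = net_change(density_gain_from_contraction(PEARSON_BARDEEN_LINEAR_PCT))
    new = net_change(THIS_STUDY_DENSITY_GAIN_PCT)
    print(f"net change with earlier contraction data: {old:+.3f} percent")
    print(f"net change with present contraction data: {new:+.3f} percent")
```

The earlier contraction data lead to a net decrease of roughly 0.3 percent, whereas the present data lead to a net increase of roughly 0.2 percent, which is the sign and size of the observed change.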

If one takes the equation for the net density change 
and calculates the anticipated change in density for 
lightly boron-doped silicon (points not included because 
of the scale used), it is found that the measured density 
increase is greater than anticipated. It is now recognized, 
however, that the data for these very lightly doped 
samples cannot be used in the argument since the 
density differences measured are of the same order as or
smaller than the variation (20×10⁻⁶ density units)
found between different crystals of silicon of presumably 
similar properties. Thus, the sensitivity of the method 
has been greater than necessary because of larger 
variations which were not controlled here. However, the 
fact that such readily measurable differences in density 
exist has naturally suggested further investigation. 


ELECTRICAL PROPERTIES OF BORON-DOPED 
SILICON 


The resistivity of bars of boron-doped silicon is 
given in Table III. The resistivity was measured 
potentiometrically using a moving probe. The linearity 
of the recorded traces of the potential drop as a function 
of probe position along the bars indicates good uniform- 
ity. The bars were cut transverse to the direction of 
crystal growth. 

The number of silicon atoms/cc in pure silicon is
readily calculated from the density as (2.3306/28.09)
× 6.024×10²³, or 5×10²² atoms/cc. Knowing that
boron is substitutional in silicon, the number of boron
atoms/cc is readily obtained from the composition
given in atom percent. These numbers also appear in
Table III. A mobility for holes may be calculated
from the resistivity and number of carriers, N, assuming
one carrier for each boron atom, since μ = 1/(Neρ). In
Fig. 3 these calculated room temperature mobilities
are compared with the total mobility curves used and
tested by Debye and Kohane.⁹ This confirms the
assumption of Pearson and Bardeen¹ that each
substituted boron atom accepts an electron.
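
The calculation described above can be sketched as follows. The electronic charge is an assumed modern value, and the two resistivities used in the example are the entries that read clearly in Table III (3.5×10⁻³ and 1.55 ohm-cm); pairing them with 0.052 and 0.000021 atom percent boron is an assumption based on the order of the rows. The function names are illustrative.

```python
# Sketch: carrier concentration and hole mobility from resistivity,
# assuming one carrier per substitutional boron atom.

ELECTRON_CHARGE = 1.602e-19                      # coulomb (assumed value)
SI_ATOMS_PER_CC = (2.3306 / 28.09) * 6.024e23    # from the density, as in the text


def carriers_per_cc(atom_percent_boron):
    """Boron atoms (taken as carriers) per cc for a given atom percent."""
    return SI_ATOMS_PER_CC * atom_percent_boron / 100.0


def hole_mobility(resistivity_ohm_cm, carriers):
    """Mobility in cm^2/volt-sec from mu = 1 / (N e rho)."""
    return 1.0 / (carriers * ELECTRON_CHARGE * resistivity_ohm_cm)


if __name__ == "__main__":
    print(f"silicon atoms/cc: {SI_ATOMS_PER_CC:.3e}")
    # (atom percent boron, resistivity in ohm-cm); row pairing is assumed
    for atom_pct, rho in [(0.052, 3.5e-3), (0.000021, 1.55)]:
        n = carriers_per_cc(atom_pct)
        mu = hole_mobility(rho, n)
        print(f"{atom_pct} at.% B: N = {n:.2e} /cc, mu = {mu:.0f} cm^2/volt-sec")
```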

From the room temperature data it is not possible to 
determine the transition from semiconduction to 
metallic conduction. More information on the electrical 
properties is obtained from Hall coefficient vs tempera- 
ture data. 


SUMMARY 


(1) The density of single-crystal silicon has been 
determined by a gradient density column technique as 
2.33067 (25.3°C), a value found to agree with the density 
calculated from the x-ray determined lattice parameter 
5.4295±0.0005 Å.

(2) The lattice contraction and density of boron-doped
silicon of chemically determined compositions
were measured. The data are analyzed to show that
boron enters the silicon crystal substitutionally.

(3) The above conclusions rest on the assumption
that in a distorted crystal the number of atoms/cc
determined by x-rays is the same as the number of
atoms/cc obtained from the density. This situation has
been discussed by Eshelby.¹⁰ We may alternatively
assume in the present work that boron is substitutional
and derive evidence for the validity of Eshelby’s
conclusions.

(4) The electrical resistivity at room temperature
of boron-doped silicon shows each boron atom to be an
acceptor of one electron and leads to hole mobilities
consistent with previous theoretical and experimental
results as given by Debye and Kohane.


TABLE III. Room temperature electrical data for
boron-doped silicon.

Sample    Atom percent    Carriers/ccᵃ    ρ, ohm-cmᵇ    μ, cm²/volt-sec
          boron                                          (calc.)

RR-55     0.314           1.55×10²⁰       8.7×10⁻⁴
RR-54     0.104           5.2×10¹⁹        1.7×10⁻³
RR-53     0.052           2.6×10¹⁹        3.5×10⁻³
RR-52     0.021           1.05×10¹⁹       7.6×10⁻³
RR-51     0.0021          1.05×10¹⁸       4.0×10⁻²
RR-50     0.00021         1.05×10¹⁷       2.0×10⁻¹
RR-49     0.000021        1.05×10¹⁶       1.55

ᵃ Calculated assuming one carrier per boron atom.
ᵇ Measured at 25°C.


⁹ P. P. Debye and T. Kohane, Phys. Rev. 94, 724 (1954).

Fig. 4. Density gradient column and principle of interpolated
density of unknown from bodies of calibrated density.
APPENDIX. DENSITY GRADIENT METHOD


A precision density method particularly well suited to 
measuring small differences is the density gradient 
method developed by Linderstrom-Lang⁷ to study the
density change in liquids. Its adaptation to solids 
follows directly. 

¹⁰ J. D. Eshelby, J. Appl. Phys. 25, 255 (1954).

As may be seen in Fig. 4, the gradient column
consists of two bulbs connected by a tube. The lower
bulb is filled to some middle mark, M, with a solution,
A, whose density is greater than that of the objects being
studied. The remainder of the column is filled with a
similar solution, B, adjusted to have a density less
than that of the material investigated. The two liquids are
gently stirred in the region, M, and the thermostated
column is allowed to stand in order that a stable gradient
in density free from convection is set up from A to B.
An object of density between that of solutions A and
B when placed in the column will come to rest at some
position at which its density is matched by that of the
liquid mixture. In order to establish this density, the
density distribution in the column must be known.
Of several possible methods, the most straightforward
is to introduce several objects of calibrated density.
The density of the unknown is readily determined by
interpolation from the positions of the calibrated
bodies. It is obvious that the precision increases with
more calibrated pieces in the column, particularly if
these are of nearly the same density as the sample
under study.
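
The interpolation step can be illustrated with a short sketch. It assumes the density varies linearly with height between neighbouring calibrated bodies, which is a simplification; the positions and densities in the example are hypothetical values chosen for illustration, not data from this study, and the function name is likewise illustrative.

```python
# Sketch: density of an unknown from its rest position in a gradient column,
# by linear interpolation between calibrated bodies (assumed linear gradient).

def interpolate_density(position_cm, calibration):
    """Return the density at position_cm.

    calibration is a list of (position_cm, density_g_per_cc) pairs for the
    calibrated bodies; the unknown must lie between two of them.
    """
    points = sorted(calibration)
    for (h0, d0), (h1, d1) in zip(points, points[1:]):
        if h0 <= position_cm <= h1:
            fraction = (position_cm - h0) / (h1 - h0)
            return d0 + fraction * (d1 - d0)
    raise ValueError("position lies outside the calibrated range")


if __name__ == "__main__":
    # Hypothetical calibration spheres: depth below a reference mark, density.
    spheres = [(10.0, 2.33000), (20.0, 2.33031)]
    print(f"{interpolate_density(15.0, spheres):.6f} g/cc")
```

Adding more calibrated bodies shortens each interpolation interval, which is the reason the precision improves as noted above.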

The density columns used were of the dimensions 
given in Fig. 4. With these dimensions, objects placed 
near the vertical axis of the column did not tend to
move to the glass walls as readily as with columns of
smaller tube diameter.

Many solutions cannot be used because of instability 
from moisture, air, or light. We successfully used 
solutions of iodobenzene [d=1.824 (20°C/4°C)]
with either diiodomethane [methylene iodide,
d=3.325 (20°C/4°C)] or 1,3-diiodopropane [trimethylene
diiodide, d=2.561 (25°C)]. These liquids were
obtained from Eastman Kodak Company and were 
filtered through fritted glass before use. 

The density columns and solutions were maintained 
in a 94 gallon (American Instrument) glass-sided water
bath controlled at 25.3°C to ±0.001°C.

Calibration spheres about } in. diameter were made 
from a glass rod produced by fusing two tapered rods 
of different density laid side by side. From such a glass 
“grade,” spheres ranging in density from that of one 
glass to the other were readily melted and dropped from 
the rod. For these studies, Nonex (d=2.35) and
uranium (d=2.27) glasses were used (Corning 7720
and 3321, respectively).

In order to avoid calibrating an unnecessarily large 
number of spheres, the glass spheres together with a 
piece of silicon were placed in a relatively insensitive 
column. Those spheres coming to rest close to the 
silicon were recovered for calibration. The density of 
the spheres selected for calibration was determined 
by making up solutions of the same liquids used in the 
column and of a composition that would just float 
them. The density of these solutions was then determined
pycnometrically. The procedure is very laborious
since the composition of the flotation solution must be 
such that finally one drop of added mixture not only 
floats the specimen but also represents a density change 
in the solution within the ultimate accuracy desired. 
These final adjustments were made with solutions 
of such density that a drop added to the calibrating
solution represented a stepwise change of about 
5×10⁻⁶ g/cc. Considerable time is necessary to
determine the influence of each final drop since the 
solution must be free of currents and temperature 
equilibrium must be assured. 

The density of the solution found just sufficient to 
float a calibration sphere was determined in a
volumenometer pycnometer calibrated with boiled distilled
water.¹¹ The calibration is good to ±0.00001 g/cc,
based on a consideration of the measurement errors. 

The liquids used to measure density were made from 
two solutions, one of which would just float, and the 
other of which would just sink the calibration spheres. 
After a column had been prepared, it was possible to 
have two calibration spheres whose density differed by 
310×10⁻⁶ g/cc separated by about 10 cm in the column.
The position of the specimens in the column was read
on a cathetometer to ±0.1 mm. The sensitivity of the
method was therefore of the order 3×10⁻⁷ g/cc even
though the densities are known on an absolute scale
to only 10⁻⁵ g/cc.
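
The sensitivity figure follows from the gradient and the reading precision, as the following sketch shows. It reads the density difference between the two calibration spheres as 310×10⁻⁶ g/cc over about 10 cm and the cathetometer precision as 0.1 mm, and assumes a uniform gradient between the spheres. The function name is illustrative.

```python
# Sketch: smallest resolvable density difference in a gradient column,
# assuming a uniform gradient between two calibration spheres.

def column_sensitivity(density_span, separation_cm, reading_precision_cm):
    """Density change (g/cc) corresponding to one reading increment."""
    gradient = density_span / separation_cm      # g/cc per cm of column
    return gradient * reading_precision_cm


if __name__ == "__main__":
    s = column_sensitivity(310e-6, 10.0, 0.01)   # 0.1 mm = 0.01 cm
    print(f"sensitivity = {s:.1e} g/cc")
```

The result is of the order 3×10⁻⁷ g/cc, in line with the value stated above, and is independent of the absolute accuracy of the calibration.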


¹¹ N. E. Dorsey, Properties of Ordinary Water (Reinhold
Publishers, New York, 1940), p.
It was found convenient to have the silicon samples
in the form of spheres. Spheres 3 to 4 mm diameter 
were prepared by tumbling polyhedra of silicon over 
silicon carbide paper in a circular “race-track.” The 
polyhedra were propelled by introducing a stream of air 
tangentially to the circular track. The spheres were 
etched bright and smooth in 3 HNO₃ : 1 HF, washed in
distilled water, treated in HF, and again rinsed in
distilled water. This treatment had been found by
Gallagher and Blodgett to give a hydrophobic surface.¹²
Successive treatments did not alter the density as 
determined from the position of flotation in a density 
column. 

Although much effort is required to set up and cali- 
brate a density column, once prepared the columns 
may be used for months. Samples may be removed and 
added using a platinum screen cup on the end of a wire. 
Care is necessary to avoid excessive agitation and time 
must be allowed for equilibration before dependable 
measurements may be made. Without sample changes, 
a column appears to lose less than 5 percent sensitivity 
in a month. 


¹² G. J. Gallagher and K. B. Blodgett (private communication).
Properties of Ohmic Contacts to Cadmium Sulfide Single Crystals


R. W. Smith
RCA Laboratories Division, Radio Corporation of America, Princeton, New Jersey 
(Received September 21, 1954) 


The usual materials and techniques for making electrical
contact to CdS single crystals are not satisfactory. The V-I
characteristics are erratic and nonlinear, the photocurrent noise is
excessive and there is a spurious photovoltaic effect, all associated
with poor, high-resistance contacts. It has been found that In
and Ga make ohmic contact to the crystals. With ohmic contacts
the disturbing effects of the barriers are removed and
measurements of the electrical characteristics of the crystals can be made
with confidence. In addition, maximum performance can be
obtained from the crystals. From the measurements it is shown that
there can be large volume photosensitivity in CdS, and that the
surface and volume (dark) conductivity of a crystal can differ
greatly. Typical V-I characteristics are presented demonstrating:
(1) The improved performance obtained from a photosensitive
CdS crystal with In contacts compared with that obtained with
the Au contacts; (2) broad-area CdS crystal rectifiers. One was
made from highly conducting crystals from which a forward
current density ~1 amp/cm² was obtained at 1.5 volts with a front
to back current ratio of 10°. The other, a rectifier-photoconductive
cell, was made from an insulating photosensitive crystal. In the
forward direction the gain was >10⁴ at 6 volts (corresponding to
~10 amp/lumen). (3) Space-charge-limited current in an
insulator. The current is time dependent and increases more rapidly
with voltage than the square-law theoretically predicted for a
perfect insulator. The effect of crystal imperfections in
determining the V-I characteristics is indicated.


I. INTRODUCTION


The availability of excellent single crystals of
cadmium sulfide with controlled characteristics
and the ability to make both ohmic and rectifying
contact to the crystals¹ make possible, from relatively
simple current-voltage measurements, a determination
of some of the pertinent electrical characteristics of the
crystals. In addition, ohmic, rectifying, and
space-charge-limited types of behavior are demonstrated.
Large space-charge-limited currents in an insulator
were predicted some 15 years ago by Mott and Gurney,²
but had not been observed because the two rather
critical requirements, of (1) ohmic contacts to (2) a
relatively perfect insulator, had not been met. It has
been shown³ that even in the case of CdS, the
theoretical square-law dependence of space-charge current
on voltage is altered by imperfections in the crystal.
The contrast between measurements made with poor
electrical contacts and measurements made with ohmic
contacts emphasizes the importance of the electrode
problem in measurements on semiconductors and
insulators. Commonly used materials (Ag, Au, Cu,
graphite, Pt, ...) and techniques do not make good electrical
contact to CdS crystals. With these electrodes the
V-I characteristic is erratic and nonlinear, the potential
distribution is not uniform along the crystal, the
photocurrent is excessively noisy, and there is an extraneous
photovoltaic effect, all characteristic of high-resistance
contacts. Because of the barriers the measurements are
uncertain and the performance of the crystal is limited.


¹ R. W. Smith and A. Rose, Phys. Rev. 92, 857 (1953).

² N. F. Mott and R. W. Gurney, Electronic Processes in Ionic
Crystals (Oxford University Press, London, 1940), pp. 168-173.

³ A. Rose and R. W. Smith, Phys. Rev. 92, 857 (1953).

It has been found, however, that indium (In) and 
gallium (Ga) do make ohmic, noise-free contact® to 
CdS crystals. With ohmic contacts the uncertainties 
in the earlier measurements have been removed and in 
addition maximum performance can be obtained from 
the crystals. The following criteria are used for maxi- 
mum performance for the crystal and the degree to 
which they are approached may be taken as a measure 
of the perfection of the contacts: (1) linear current- 
voltage characteristic, particularly at low voltages; 
(2) absence of photovoltaic effect; (3) maximum per- 
formance of the crystal as a photoconductor; (4) the 
noise associated with the photocurrent is limited only 
by the noise associated with the photon stream ; (5) the 
current passed by an insulating crystal is limited by 
the space-charge in the crystal and not by the supply 
of carriers available at the electrodes. From the V-I
characteristics described in this paper it is seen that the 
above criteria are closely satisfied and that In and Ga 
do make excellent ohmic contact to CdS crystals. 


II. DESCRIPTION OF THE CRYSTALS 
1. Crystal Preparation 


The crystals used in this study were grown and 
treated by R. H. Bube, C. J. Busanovich, and S. M. 
Thomsen of these Laboratories. They were grown by 
the vapor phase technique.⁶,⁷ If pure Cd and pure H₂S
are used, insulating crystals are obtained. In practice it
seems easier to grow conducting crystals by the
addition of halogen to the reacting vapors and to
subsequently treat the resulting crystals by controlled
impurity additions, usually Ag or Cu.⁷


2. Physical Characteristics 


CdS crystals are normally n-type. Many tests have 
been made for hole conduction with generally negative 
results. The crystal resistivity can be controlled from 
~1 to >10" ohm-cm. The resistivity obtained for a 
particular crystal can depend on the way it is measured ; 
if for example, the measurement is made between two 
electrodes on the same side of the crystal an apparent 


⁴ R. W. Smith, RCA Rev. 12, 350 (1951).

⁵ Shulman, Rose, and Smith, Phys. Rev. 92, 857 (1953); C. I.
Shulman, Phys. Rev. (to be published).

⁶ R. Frerichs, Phys. Rev. 72, 594 (1947).

⁷ R. H. Bube and S. M. Thomsen, J. Chem. Phys. 23, 15 (1955).
(surface) resistivity of the order of 10° ohm-cm may be
obtained. If, however, the measurement is made with 
electrodes on opposite faces of the crystal a (volume) 
resistivity of the order of 10 ohm-cm may be obtained. 
An upper limit to the electron mobility seems to be 
~200 cm²/volt-sec. The energy gap between the filled
band and the conduction band is 2.4 ev and there is
ample evidence for traps distributed in the “forbidden”
band.⁸ For highly conducting crystals the current ratio,
ΔI (photocurrent) : I (dark), is less than unity. As the
resistivity is increased, by impurity additions, the ratio 
increases to values as high as 108 for p of the order of 10° 
ohm-cm. The purer crystals are more insulating and 
rather insensitive photoconductors. 

The size and shape of the crystals depend critically 
on the growing conditions. The crystals used in this 
study were usually thin clear plates. The structure and 
degree of crystal perfection has been determined by 
J. Amick and G. Neighbor of these Laboratories and 
from their x-ray measurements the crystals appear to 
be excellent single crystals of the hexagonal form. 


III. ELECTRODES FOR CdS CRYSTALS 
1. The Rectifying Contact 


There is normally no problem in making rectifying 
contact to CdS crystals. In fact, the problem in the 
past was how to make ohmic contact. There are, how- 
ever, several applications where good rectifying con- 
tacts are needed. Air-drying Ag paste is easy to apply, 
and produces an excellent rectifying contact. Evapora- 
ted Au is good as a semitransparent electrode for a 
rectifier-photoconductive cell. As a guide it may be 
said that a high work function metal makes a good 
rectifying contact to CdS crystals. 


2. The Ohmic Contact 


Indium and gallium appear to be exceptional in that 
they do make noise-free, ohmic contact to CdS crystals. 
There is little difference in performance between the 
two elements. In (mp 155°C) and Ga (mp 29.8°C, 
usually a liquid at room temperature) are metals that 
are relatively stable in air. There is no apparent re- 
action of In or Ga with CdS and diffusion into the 
crystal is not necessary for an ohmic contact. In fact, 
simple pressure contact to the crystal has given ohmic 
characteristics. A simple and convenient technique of 
applying electrodes is to evaporate an opaque layer of 
In on the crystal. 


IV. THE CURRENT-VOLTAGE CHARACTERISTICS 
1. The Linear V-I Characteristic


Figure 1 contrasts typical V-I characteristics for two
similar, photosensitive crystals with different
electrodes. The electrode separation and irradiation are also
similar. Figure 1(a) is the characteristic with Au
electrodes. The nonlinearity is obvious, the potential
distribution along the crystal is not uniform, the
photocurrent noise is excessive and there is a photovoltaic
effect. Figure 1(b) is the characteristic with In
electrodes. The curve is linear from low voltages up to the
point where heating effects become important and is
linear for a large range of light levels. Shulman⁵ has
measured the photocurrent noise and found that at the
lower limit it can be determined by the noise of the
absorbed photon stream. The photovoltaic effect can
be negligibly small. When measured under high light
conditions, maximum photoconductive performance⁹,¹⁰
is obtained from the CdS crystal.


⁸ R. H. Bube, J. Chem. Phys. 23, 18 (1955).


FIG. 1. Oscilloscope traces of V-I characteristics of similar
insulating, photosensitive CdS crystals. (a) Typical nonlinear
trace obtained with the usual electrode materials, in this case Au.
(b) Linear characteristics obtained with In electrode.

* The performance of a photoconductor*” can be evaluated as
follows: With a high level of light on the crystal, the effect of
traps is minimized and the observed time constant, τ₀, approaches
the lifetime, τ, of the free carriers. If under these conditions the
experimentally determined quantities satisfy the expression for
the gain, G:

$$G = I/eF = \tau\mu V/L^{2}, \qquad (1)$$

maximum performance is obtained from the photoconductor. The
measured quantities are the photocurrent I (amp) for F absorbed
photons per sec with V (volts) across electrodes L (cm) apart.
The mobility μ ≈ 100 cm²/volt-sec for CdS and e is the electronic
charge.
A. Rose, RCA Rev. 12, 362 (1951). 
FIG. 2. Broad-area CdS crystal rectifier. One Ag and
one In electrode on conducting CdS crystal.


The curves of Fig. 1 were taken with low voltages 
across the crystal where the nonohmic character of 
poor contacts is emphasized. At higher voltages, the 
curve in Fig. 1(a) becomes more nearly linear and it 
was under these conditions that the expression (1) for 
the gain, G, was originally* verified for CdS.
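
As a rough numerical illustration of expression (1), read here as G = I/eF = τμV/L², the sketch below evaluates the gain from a time constant, mobility, voltage, and electrode spacing, and then the photocurrent implied by a given rate of photon absorption. The function names and the sample inputs are assumptions chosen only for illustration; of the inputs, only the mobility of about 100 cm²/volt-sec for CdS comes from the text.

```python
ELECTRON_CHARGE_C = 1.602e-19  # electronic charge e, in coulombs


def photoconductive_gain(tau_s, mobility_cm2_per_volt_s, volts, spacing_cm):
    """Gain G = tau * mu * V / L**2 (dimensionless)."""
    return tau_s * mobility_cm2_per_volt_s * volts / spacing_cm ** 2


def photocurrent_amp(gain, absorbed_photons_per_s):
    """Photocurrent I = G * e * F, obtained by rearranging G = I / (e F)."""
    return gain * ELECTRON_CHARGE_C * absorbed_photons_per_s


if __name__ == "__main__":
    # Hypothetical inputs, not measurements reported in the paper.
    g = photoconductive_gain(tau_s=1e-3, mobility_cm2_per_volt_s=100.0,
                             volts=10.0, spacing_cm=1e-2)
    print(f"gain = {g:.3g}")
    print(f"photocurrent = {photocurrent_amp(g, 1e10):.3g} amp")
```

Because the gain scales as 1/L², moving the electrodes from the crystal surface to opposite faces of a thin section raises the gain sharply at a fixed voltage, which is why lower voltages suffice in the through-the-crystal test.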

With electrodes widely separated on the surface of 
the crystal it is difficult to determine whether the 
photoconduction is a surface or volume effect. One 
way of separating the two effects is to arrange the elec- 
trodes opposite one another on the thin section of the 
crystal and irradiate only the electrode area with light 
that can penetrate the crystal. If the gain expression (1) 
is satisfied, and if the gain is comparable to that ob- 
tained with the other electrode arrangement, a large 
volume photosensitivity is thereby demonstrated. For 
this test through the crystal lower voltages are used 
because the electrode separation is smaller. With Ag,
Au, ... (nonohmic) contacts the disturbing effect of
the barriers at the contacts introduced uncertainties in 
the application of (1). With In or Ga ohmic contacts, 
the barriers are not present and the volume photo- 
conductivity can be determined with confidence. The 
measurement has been made and there is a large volume 
photoconductivity in CdS. This has also been verified 
by measurements in the forward direction on a CdS 
crystal photoconductive rectifier to be presently de- 
scribed. 
FIG. 3. V-I characteristic of rectifier-photoconductive cell.
Semitransparent Au electrode and [illegible]. Room irradiation.


2. The Rectifier Characteristics

[illegible] broad-area devices [illegible] dimensions) in
[illegible] the point contact rectifiers that are readily made
with two similar metals. Two types of CdS rectifiers have been
[illegible] conducting [illegible] many orders of magnitude
greater [illegible].

A. The Conducting Crystal Rectifier

Figure 2 shows the dc characteristic of a conducting
(ρ~10 ohm-cm) CdS crystal with a Ag rectifying contact. The
current density in the forward direction at 1.5 volts is
approximately 1 amp/cm² and the forward to back current ratio
is greater than 10°. The forward direction is with the Ag
positive, consistent with n-type CdS. A photovoltaic effect of
a few tenths of a volt has been measured.

B. The Rectifier Photoconductive Cell

The broad area rectifier-photoconductive device is made with
an In electrode on one side of a thin (~5×10⁻³ cm), insulating,
photosensitive crystal and a semitransparent Au rectifying
contact on the other side. The cell is normally operated in the
forward direction, Au positive. Figure 3 shows the V-I
characteristic for a cell irradiated with room light.

Under high light conditions and operated in the forward
direction with 6 volts across the crystal, the observed time
constant τ₀ was 2×10⁻³ sec and a gain, G>10⁴ (corresponding to
~10 amp/lumen), was measured with both uniform irradiation and
with a small spot of green light. Maximum theoretical
performance of the photoconductor was obtained with the small
spot of green light. Since the electrode spacing used in
computing this performance was the crystal thickness, this is
unambiguous evidence that the currents were volume currents and
that high photosensitivity can be obtained as a pure volume
effect without recourse to any special properties of the
surface.


3. Space-Charge-Limited Current

A perfect insulator with ohmic contacts should be able to pass
an appreciable current. The expression derived by Mott and
Gurney for the space-charge current in an insulator is

$$J = 10^{-13}\,\mu k V^{2}/L^{3}, \qquad (2)$$

where J (amp/cm²) is the current density between plane parallel
electrodes with V (volts) across L (cm). The drift mobility μ is
100 cm²/volt-sec and the dielectric constant k is 10, for CdS.
For a typical CdS crystal, L is 5×10⁻³ cm, and the electrode
area is 10⁻³ cm². The space-charge current expected from the
square-law (2) is 1 ma for V=35 volts. By contrast the Ohm's law
current, assuming ρ= 10" ohm-cm, would be 10-" amp for the same
voltage. For CdS crystals, currents far in excess of the Ohm's
law value are measured and in fact under certain conditions the
theoretical square-law has been observed.¹¹

Figure 4 shows the dynamic V-I characteristic, as seen on an
oscilloscope, when a 60-cps sine voltage is applied between In
electrodes opposite one another on the thin section of an
insulating CdS crystal. Linear photocurrent curves are obtained
with light on the crystal (F₁ and F₂ represent two different
light levels). If now the crystal is placed in the dark and the
voltage amplitude (V) slowly increased, the curve a is traced
out. With continued increase of V, the trace moves away from the
I-axis to a₃. If, when the trace reaches a₃, the voltage
amplitude is held fixed at V₁, the trace gradually falls to a₄.
If now V is turned to zero and again slowly increased, no trace
is obtained until V reaches V₁ and then b₁ is traced out and
moves away from the I-axis toward b₂ with further increase in V.
If V is reduced to zero, the crystal exposed to light, and then
again placed in the dark, a similar sequence of curves is again
obtained as V is changed.

FIG. 4. Space-charge current in an insulator. Sketch illustrating
V-I characteristics, as seen on an oscilloscope, of insulating
CdS crystal with In electrodes opposite one another on the thin
section of the crystal. 60-cps voltage applied across crystal.
(Curve labels in the sketch: PHOTOCURRENT, SPACE-CHARGE CURRENT.)

11 R. W. Smith and A. Rose, following paper [Phys. Rev. 97,
1531 (1955)].
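
The comparison between the square-law current and the Ohm's law current in this section can be checked numerically. The sketch below reads Eq. (2) as J = 10⁻¹³ μkV²/L³ (the Mott and Gurney form in practical units) and uses μ = 100 cm²/volt-sec and k = 10 from the text. The thickness of 5×10⁻³ cm and electrode area of 10⁻³ cm² are assumed readings chosen because they reproduce the quoted 1 ma at 35 volts; the resistivity passed to the Ohm's law function is purely hypothetical.

```python
def square_law_current_density(volts, thickness_cm,
                               mobility=100.0, dielectric_constant=10.0):
    """Space-charge current density in amp/cm^2: 1e-13 * mu * k * V^2 / L^3."""
    return 1e-13 * mobility * dielectric_constant * volts ** 2 / thickness_cm ** 3


def ohmic_current(volts, thickness_cm, area_cm2, resistivity_ohm_cm):
    """Ohm's law current in amp for a slab: I = V * A / (rho * L)."""
    return volts * area_cm2 / (resistivity_ohm_cm * thickness_cm)


if __name__ == "__main__":
    thickness_cm = 5e-3   # assumed crystal thickness
    area_cm2 = 1e-3       # assumed electrode area
    volts = 35.0
    i_sc = square_law_current_density(volts, thickness_cm) * area_cm2
    # Hypothetical resistivity, used only to show the size of the contrast.
    i_ohm = ohmic_current(volts, thickness_cm, area_cm2, resistivity_ohm_cm=1e12)
    print(f"square-law current: {i_sc:.3g} amp")
    print(f"Ohm's law current:  {i_ohm:.3g} amp")
```

The steep 1/L³ dependence is the reason thin crystal sections are needed before the space-charge current becomes large enough to observe.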

Figure 5 represents dc measurements on a similar crystal. For
each increase in voltage there is a high transient current that
slowly approaches a lower stationary value. In the dark this is
represented by the curve I₀. This is the steady space-charge
current that can be passed by insulating CdS. If the crystal is
irradiated with different light levels L₁, L₂, and L₃, linear
photocurrent curves are obtained from low voltages up to the
point where the I₀ characteristic predominates.

FIG. 5. Space-charge-limited current in an insulator. Dc
measurement on crystal similar to that of Fig. 4. I₀ is dark
current curve and F₁, F₂, and F₃ are curves with increasing
irradiation.
(a) RECTIFYING CONTACT    (b) OHMIC CONTACT


FIG. 6. The contact between a metal and semiconductor. (a)
Rectifying barrier obtained with high work function metals.
(b) Ohmic contact. The density of carriers, N₀, available to the
semiconductor at the interface in terms of the barrier height,
(φ₁−χ₂), is


(φ₁−χ₂) (ev)    1.0    0.5    0.1
N₀ (cm⁻³)       10?    10”    10””


φ₁ and φ₂ are the work functions of the metal and semiconductor
respectively; χ₂ is the electron affinity of the semiconductor.
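
The trend in the tabulated carrier densities follows from a Boltzmann factor in the barrier height. The sketch below is a minimal illustration of that dependence, not the calculation used for the caption: the effective density of states of 10¹⁹ cm⁻³ and the room-temperature kT of 0.0259 ev are assumptions introduced here, and the function name is hypothetical.

```python
import math

KT_ROOM_EV = 0.0259            # assumed thermal energy at room temperature, ev
EFFECTIVE_STATES_CM3 = 1e19    # assumed effective density of states, cm^-3


def interface_carrier_density(barrier_ev,
                              states_cm3=EFFECTIVE_STATES_CM3,
                              kt_ev=KT_ROOM_EV):
    """Carrier density at the interface: N0 = Nc * exp(-barrier / kT)."""
    return states_cm3 * math.exp(-barrier_ev / kt_ev)


if __name__ == "__main__":
    for barrier_ev in (1.0, 0.5, 0.1):
        n0 = interface_carrier_density(barrier_ev)
        print(f"barrier {barrier_ev:.1f} ev -> N0 ~ {n0:.1e} cm^-3")
```

Under these assumptions each 0.1 ev of barrier height changes the available carrier density by a factor of roughly fifty, so a barrier of a few tenths of an ev already separates an ohmic contact from a strongly rectifying one.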


The ac and dc measurements contain essentially the 
same information and illustrate the important prop- 
erties of a thin insulating crystal of CdS with ohmic 
contacts, namely that the current in the dark is time- 
dependent and increases more rapidly than the square 
of the voltage across the crystal. With light on the 
crystal the photocurrent is proportional to the voltage 
until exceeded by the space-charge-limited current. 

Crystal imperfections or traps in CdS account for 
the time dependence of the space-charge current and 
for the deviation from the theoretical square-law V-I
characteristic for a perfect insulator to a power law or
exponential variation.’ More detailed experimental and
analytical descriptions of the space-charge currents in
CdS are given in papers by Smith and Rose¹¹ and by
Rose.¹² Briefly the mechanism proposed is as follows:
The ohmic contact to the insulator provides a reservoir 
of electrons with free access to the conduction band 
of the crystal. With the application of the electric 
field carriers are injected into the conduction band of 
the insulator and, for a perfect crystal, the current is 
given by the square-law (2). For an imperfect crystal 
most of the charge injected into the conduction band 
falls into traps where it cannot take part in conduction. 
Imperfections reduce the drift mobility, μ, in a way
determined by the density and distribution of traps. 
The initial burst of current observed when the voltage 
is changed is the current passed before appreciable 
trapping takes place. The subsequent decay of the 
current to a stationary value is the effect of charge 
being trapped from the conduction band. The rate of 
decay depends on the density of electrons in the con- 
duction band and the capture cross section of the traps. 


12 A. Rose, accompanying paper [Phys. Rev. 97, 1538 (1955) ]. 
Finally, it can be shown that the distribution of traps
can determine the form of the V-I characteristic. In
particular, if the distribution of traps is uniform in
energy below the conduction band the current varies
exponentially with the voltage.


V. DISCUSSION 


In the previous sections, it is shown that the usual 
materials used for electrodes to CdS do not readily 
make good reliable contact. It is shown that In and Ga 
do make good ohmic contact. The degree of perfection 
is indicated by the noise measurements of Shulman,°* by 
the measurements of space-charge-limited currents, and 
by the low-voltage performance as a photoconductor. 

It is believed that these results can be interpreted in
terms of the basic theory¹³,¹⁴ of the contact between a
metal and a semiconductor. The generally accepted
view of the contact is shown in Fig. 6, which is drawn
specifically for an n-type semiconductor. When a metal
of work function φ₁ is joined to a semiconductor of work
function φ₂, a dipole layer is formed such as to produce
a potential drop from the metal to the interior of the
semiconductor equal to (φ₁−φ₂). The height of the
potential hill determines the degree of rectification of
the contact and in particular determines the supply
of carriers available to the semiconductor or insulator
and the ease with which they can be injected. Metals
for which φ₁>φ₂ make rectifying contact, as for example
Ag or Au on CdS. For metals with low work function,
φ₁~φ₂, the barrier height may be negligibly small and
the supply of carriers available at the interface large.
In short the contact is ohmic. Most of the
low-work-function metals are extremely reactive chemically
and so in practice are not suitable for electrodes.


13 See reference 2, also pp. 174-185. 
14 J. Bardeen, Phys. Rev. 71, 717 (1947).




The work function of In and Ga is estimated¹⁵⁻¹⁷ to be
3-4 ev and the metals seem to be chemically suitable.
Furthermore it has generally been found that there is 
poor correlation between the degree of rectification 
and the work function of the metal making contact to 
a semiconductor. Bardeen accounts for the lack of 
correlation in terms of a shielding effect of surface 
states on the semiconductor. It is suggested that the 
case of In on CdS is one in which the low-work function 
of In overbalances the effect of surface states to the 
extent of actually reversing the curvature of the band 
structure at the surface from a rectifying to an ohmic 
pattern. It is suggested then that the noble metals 
make rectifying, nonohmic contact to CdS because 
their work functions are large compared with CdS and 
that In and Ga make ohmic contact because their work 
function is equal to or less than that of CdS. 
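
The suggestion in this paragraph amounts to a simple comparison of work functions. The sketch below encodes that comparison using the relative work functions quoted in footnote 17 (Au 4.9 ev, Ga 3.6 ev, CdS 4.2 ev). It is a deliberately simplified rule with a hypothetical function name: it ignores the surface-state shielding described by Bardeen, which the text notes can spoil this correlation for other semiconductors.

```python
def classify_contact(metal_work_function_ev, semiconductor_work_function_ev):
    """Return 'rectifying' when the metal work function exceeds that of the
    n-type semiconductor, and 'ohmic' when it is equal or smaller.
    Simplified rule: surface states are ignored."""
    if metal_work_function_ev > semiconductor_work_function_ev:
        return "rectifying"
    return "ohmic"


if __name__ == "__main__":
    cds_ev = 4.2  # relative work function of CdS from footnote 17
    for metal, phi_ev in (("Au", 4.9), ("Ga", 3.6)):
        print(f"{metal} on CdS: {classify_contact(phi_ev, cds_ev)}")
```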

The basic and important feature of an ohmic contact 
between a metal and a semiconductor is that the metal 
serves as a reservoir of carriers with free access to the 
conduction band of the semiconductor. This property 
of an ohmic contact, in addition to the fact that CdS 
can be made in relatively perfect single crystals, makes 
possible the observation of the space-charge-limited 
currents reported. The magnitude of the currents 
drawn indicates that the barrier height at the CdS-In
interface must be less than or equal to 0.1 ev. 

The author acknowledges the suggestions and dis- 
cussions with Dr. A. Rose throughout this work. 


15 H. B. Michaelson, J. Appl. Phys. 21, 536 (1950).

16 It is instructive to plot the known work functions of the
elements against their ionization potential. From this correlation 
one estimates a work function ~3 ev for In and Ga. 

17 By means of television scanning, A. D. Cope of these Labora- 
tories has measured the relative work functions of CdS, Ga, and 
Au. One face of a highly conducting CdS crystal was partly 
covered with separate Au and Ga dots. The landing potential of a 
low-velocity scanning beam was measured for each of the three 
regions. Assuming φ(Au)=4.9 ev we get φ(Ga)=3.6 and φ(CdS)=4.2 ev.
Currents as high as 20 amperes per cm² can be drawn through thin insulating crystals of CdS in the dark.
A series of experiments demonstrates with a high degree of certainty that these are space-charge-limited
currents, the solid state analog of space-charge-limited currents in a vacuum. This conclusion is contrary to a
recently published interpretation of similar observations on CdS crystals by Böer and Kümmel.

The use of pulsed voltages made possible the observation of currents close to those of a trap-free solid. The 
steady-state currents are many orders of magnitude lower than these but still many orders of magnitude 
higher than would be expected from the low-field resistivity of the insulator. The presence of traps deter- 
mines the form and magnitude of the steady-state current-voltage curves. Conversely, these curves become a 
sensitive tool for the measurement of trap densities. Trap densities computed independently from space- 
charge-limited currents and from photoconductive currents show reasonable agreement. 





INTRODUCTION 


THE early band theory models of an insulator carried with
them implicitly the suggestion that if free carriers could be
injected into either the conduction band or the valence band,
these carriers could move freely through the solid. The
magnitude of current that could be passed through a “perfect”
insulator would be limited only by the space charge of the
carriers themselves, just as the space-charge-limited currents
in a vacuum diode. Mott and Gurney¹ derived the relation:

$$I = 10^{-13}\,V^{2}\mu k/d^{3}\ \text{amperes/cm}^{2} \qquad (1)$$

for the space-charge-limited current I through a slab of
insulator d centimeters thick when V volts were applied. μ and k
are the drift mobility and dielectric constant respectively. It
is interesting that this expression leads to the expectation of
some tens of amperes per square centimeter through an insulating
sheet 10⁻³ cm thick when ten volts are applied across opposite
faces.

The literature not only does not bear out these large currents
but is almost devoid of any evidence for steady-state
space-charge-limited currents. Gudden,² and many others since,
have described the transient effects of space charge in
insulators, chiefly in suppressing photo- or
bombardment-induced currents. Weimer and Cope³ cite evidence for
small photo-generated space-charge-limited currents in thin
films of amorphous selenium. The currents were of the order of
10⁻⁷ ampere/cm², but they were nevertheless steady currents. The
present work describes the evidence for large, steady
space-charge-limited currents drawn through thin insulating
crystals of CdS by means of ohmic contacts.⁴


* Presented at the June, 1953, meeting of the American Physical
Society [R. W. Smith and A. Rose, Phys. Rev. 92, 857(A) (1953);
A. Rose and R. W. Smith, Phys. Rev. 92, 857(A) (1953)].

1 N. F. Mott and R. W. Gurney, Electronic Processes in Ionic
Crystals (Oxford University Press, London, 1940), pp. 168-173.

2 B. Gudden, Lichtelektrische Erscheinungen (Verlag Julius
Springer, Berlin, 1928).

3 P. K. Weimer and A. D. Cope, RCA Rev. 12, 314 (1951).
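
The numerical factor in relation (1) of the Introduction can be traced to the trap-free square law written with the permittivity of free space in practical units, J = (9/8) ε₀ k μ V²/d³. The sketch below is a check of that arithmetic under stated assumptions: the (9/8) ε₀ form is the standard statement of the Mott and Gurney law rather than something quoted in this paper, and the mobility and dielectric constant used in the estimate (100 cm²/volt-sec and 10) are the CdS values given elsewhere in these papers.

```python
EPSILON_0_F_PER_CM = 8.854e-14  # permittivity of free space, farad/cm


def square_law_prefactor():
    """Numerical factor (9/8) * epsilon_0 in farad/cm, close to 1e-13."""
    return 9.0 / 8.0 * EPSILON_0_F_PER_CM


def trap_free_current_density(volts, thickness_cm, mobility, dielectric_constant):
    """Current density in amp/cm^2 from J = (9/8) eps0 k mu V^2 / d^3."""
    return (square_law_prefactor() * dielectric_constant * mobility
            * volts ** 2 / thickness_cm ** 3)


if __name__ == "__main__":
    print(f"prefactor = {square_law_prefactor():.3e} F/cm")
    # Ten volts across a sheet 1e-3 cm thick, with assumed CdS-like mu and k.
    j = trap_free_current_density(10.0, 1e-3, mobility=100.0,
                                  dielectric_constant=10.0)
    print(f"J = {j:.3g} amp/cm^2")
```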


EARLY OBSERVATIONS 


The first observations that led to identifying the
space-charge-limited currents are shown in Fig. 1, which
represents the V-I characteristics as seen on an oscilloscope.
Sixty-cycle/sec ac voltages up to about 100 volts were applied
across a thin (~5×10⁻³ cm) insulating CdS crystal having indium
electrodes. With two different values of light on the crystal,
the two linear characteristics F₁ and F₂ were obtained. With the
crystal in the dark and a small ac voltage applied, the curve a₁
was obtained. (If the amplitude of ac voltage is held fixed, the
turned up ends of the a₁ curve tend to settle towards the
voltage axis.) If the amplitude of ac voltage is increased
toward V₁, the curve a₁ “shoots up” to the form a₂ and slides
along the voltage axis with increasing amplitude of ac voltage.
If the voltage amplitude is held fixed at V₁, the curve a₃
settles down to the form a₄.

FIG. 1. Space-charge-limited current in an insulator. Sketch
illustrating V-I characteristics, as seen on an oscilloscope, of
insulating CdS crystal with In electrodes opposite one another
on thin section of crystal. 60-cps voltage applied across crystal.
(Curve labels in the sketch: PHOTOCURRENT, SPACE-CHARGE CURRENT.)

4 R. W. Smith, preceding paper [Phys. Rev. 97, 1525 (1955)].

At this point the amplitude of ac voltage may be
reduced to zero and again increased toward V₁. Both
during the decrease of voltage and during the increase
of voltage the current remains vanishingly small
until the voltage V₁ is reached at which point curve
a₄ is retraced. Further increase of voltage from V₁
towards V₂ causes a₄ to “shoot up” again to b₁, slide
along the voltage axis to b₂ and settle down to b₃ when
the voltage amplitude is held fixed at V₂.

The interesting features of these V-I characteristics
are the time dependence and the highly nonlinear but 
symmetric form of the curves. 

The observation of a rapidly rising current-voltage 
curve is not in itself surprising. This is, indeed, a
common observation in semiconductor measurements,
and any one of several phenomena is commonly
used to explain such curves. The phenomena include
field emission from electrodes, from traps or from the 
valence band; collision ionization of trapped or valence 
electrons; poor contact; barriers; and heating effects. 
What was surprising was that none of these phenomena 
appeared to fit the observations. 

The use of ohmic contacts⁴,⁵ ruled out poor contacts
and field emission from the electrodes. The linear
current-voltage curve under illumination ruled out
collision ionization. The low fields (~10⁴ volts/cm)
ruled out field emission from traps or the valence band.
The low currents ruled out heating effects. And finally, 
the assumption of internal barriers appeared improbable 
in the light of the otherwise regular performance of 
the crystal as a simple uniform photoconducting 
insulator. 

In addition, the following effects were noteworthy:
the high burst of current when the voltage was raised,
followed by a steady decline in current on standing;
the increased resistance at a given voltage resulting
from the previous application of higher voltage; the
symmetry of the curves on the voltage axis.


FIG. 2. Schematic drawing of experiment to measure charge
injection in an insulator. (Labels in the drawing: RELEASE
MECHANISM AND SHORTING BAR, In ELECTRODES, ELECTROMETER.)


5 C. I. Shulman (to be published).


QUALITATIVE INTERPRETATION 


A quick appraisal of the expected properties of 
space-charge-limited currents in a solid, as modified 
by the presence of traps, indicated that most of the 
observations could be qualitatively accounted for by 
the assumption of such currents. These currents were 
all the more reasonable because the metal contacts 
were ohmic—that is, the contacts provided a reservoir, 
or excess of carriers, ready to enter the crystal as 
needed. A qualitative description of the properties of 
space-charge-limited currents in a solid follows. 

When a voltage is first applied across the crystal, 
space charge, in the form of free electrons from the 
cathode, is forced into the crystal via its conduction 
band. This free electron charge gives rise to a large 
burst of current. If the space charge remained in the 
conduction band, the peak value of the transient 
current would continue as a steady current. In actual 
crystals, however, one must take into account the 
effects of trap densities of the order of 10'* cm~*. The 
free charge forced into the conduction band settles into 
the traps, more or less rapidly, the rate being determined 
by the capture cross section of the traps. This accounts 
for the transient increase in current and subsequent 
decrease on standing. 

For a given voltage across the crystal a fixed amount 
of charge is forced into the crystal; most of this charge 
becomes trapped and only a small fraction remains free. 
Because the trap density is likely to be large compared 
with the density of space-charge electrons, one can 
take as a first approximation that all of the charge is 
condensed into traps. The condensation of electrons 
in traps raises the Fermi level towards the conduction 
band. To preserve a proper statistical equilibrium, the 
density of electrons in the conduction band must now 
be increased in accordance with this shift in Fermi level. 
The second approximation is then to allocate some of the 
condensed or trapped charge to the conduction band. 
These two steps usually give a fairly accurate approxi- 
mation as suggested by the following numerical example. 
Let the applied voltage force 10" electron charges per
cm³ into the crystal. Let this result in raising the
Fermi level 0.1 volt towards the conduction band.
If the crystal were an insulator having 10⁶ free
electrons/cm³ prior to the application of a voltage, it
will now have about 10⁸ free electrons per cm³ to be
consistent with the new position of the Fermi level.
These 10⁸ free electrons taken out of the condensed
electrons will obviously have a negligible effect on
relocating the Fermi level.
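
The step from a 0.1-volt rise in the Fermi level to the new free-electron density is a Boltzmann-factor calculation. The sketch below shows it under the assumption of room temperature (kT taken as 0.0259 ev) and a starting density of 10⁶ free electrons per cm³, which is used here only as an illustrative input; the function name is hypothetical.

```python
import math

KT_ROOM_EV = 0.0259  # assumed thermal energy at room temperature, ev


def free_density_after_shift(initial_density_cm3, fermi_shift_ev, kt_ev=KT_ROOM_EV):
    """Free-electron density after the Fermi level moves fermi_shift_ev
    towards the conduction band: n = n0 * exp(shift / kT)."""
    return initial_density_cm3 * math.exp(fermi_shift_ev / kt_ev)


if __name__ == "__main__":
    n_before = 1e6  # assumed free-electron density before the voltage is applied
    n_after = free_density_after_shift(n_before, 0.1)
    print(f"multiplication factor: {n_after / n_before:.1f}")
    print(f"free electrons after shift: {n_after:.2e} per cm^3")
```

A 0.1 ev shift multiplies the free density by a factor of order fifty at room temperature, so the free charge remains a minute fraction of the trapped charge and the two-step approximation is self-consistent.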

If the traps are more or less uniformly distributed in 
energy in the forbidden zone, equal increments in 
voltage will make equal energy shifts in the position of 
the Fermi level. Since, however, the free electron
density depends exponentially on the position of the 
Fermi level, the free electron density will increase 
exponentially with, or at least as some high power of, 
the applied voltage. This accounts for the observed 
high power dependence of current on voltage. By way of 
contrast, for a trap-free solid the current should increase 
only as the square of the applied voltage. 
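
The contrast between a trap-dominated and a trap-free characteristic can be made concrete with a toy model. In the sketch below the Fermi level is assumed to rise in proportion to the applied voltage (uniform trap distribution), the free-carrier density follows the Boltzmann factor, and the current is taken as the product of free density and field. The shift of 0.005 ev per volt is a hypothetical parameter, and the model is only an illustration of the argument, not the detailed theory given in the accompanying paper by Rose.

```python
import math

KT_ROOM_EV = 0.0259  # assumed thermal energy at room temperature, ev


def trap_limited_current(volts, shift_ev_per_volt, kt_ev=KT_ROOM_EV):
    """Relative current: field (proportional to V) times free density,
    which grows as exp(shift * V / kT) for a uniform trap distribution."""
    return volts * math.exp(shift_ev_per_volt * volts / kt_ev)


def local_power(current_fn, volts, rel_step=1e-4):
    """Local exponent d(ln I)/d(ln V), estimated by a central difference."""
    hi = volts * (1.0 + rel_step)
    lo = volts * (1.0 - rel_step)
    return ((math.log(current_fn(hi)) - math.log(current_fn(lo)))
            / (math.log(hi) - math.log(lo)))


if __name__ == "__main__":
    shift = 0.005  # hypothetical Fermi-level shift per volt, ev/volt
    for v in (10.0, 50.0, 100.0):
        power = local_power(lambda x: trap_limited_current(x, shift), v)
        print(f"V = {v:5.1f} volts: current varies roughly as V^{power:.1f}")
    # Trap-free square law for comparison.
    print(f"trap-free: V^{local_power(lambda x: x ** 2, 50.0):.1f}")
```

In this model the apparent power grows with voltage, which is one way a measured curve can look like a fourth or even a twentieth power over a limited voltage range.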

The space-charge-limited currents were also 
consistent with the observed low fields, there being no 
theoretical threshold field for the observation of 
space-charge-limited currents. Finally, it was reasonable 
to expect that, when the crystal was exposed to light, 
the density of photo-generated carriers would exceed 
the density of space-charge-injected carriers and that 
these photo-generated carriers would dominate the 
behavior of the crystal and lead to the ohmic behavior 
shown in Fig. 1.


DIRECT EVIDENCE OF SPACE-CHARGE 


The interpretation in terms of space-charge-limited 
currents, with most of the space charge being trapped, 
suggested that one ought to be able to make a direct 
observation of this charge. Figure 2 shows the experi- 
mental arrangement used to make this test. 

The crystal was held by spring action between two 
indium tipped electrodes. The crystal was mounted in a 
light-tight box and poised over a metal pan connected 
to an electrometer. Leads from the two electrodes were 
taken through the light-tight box to a source of voltage. 
One electrode was grounded. The other electrode 
could be connected to the positive or negative terminal 
of a dry cell or to ground. Finally, by an external 
mechanical arrangement the crystal could be released 
from the electrodes and dropped into the electrometer 
pan. The following observations were made: 


(1) Both electrodes to the crystal were kept at 
ground and the crystal dropped into the electrometer 
pan. No charge was recorded. This indicated that the 
electrodes did not contribute any significant charge to 
the crystal by tribo-electric action. 

(2) One electrode was grounded and the other 
electrode was held at either plus 100 volts or minus 
100 volts relative to ground. A negative charge was 
recorded by the electrometer when the crystal was 
dropped into the pan. This test was subject to the 
criticism that charge from the electrodes might have 
in some way rubbed off onto the surface of the crystal. 
Hence the next test. 

(3) One electrode was kept at ground. The other 
electrode was first tapped on the plus 100 volt terminal 
of the dry cell and then returned to ground, so that 
both electrodes were at ground just before the crystal 
was dropped to the electrometer pan. A negative charge 
was recorded. The same negative charge was recorded 
when the second electrode was tapped on the minus 
100 volt terminal of a dry cell instead of the plus 100 
volt terminal. The fact that the same negative charge
was observed for either polarity of voltage applied to 
the crystal ruled out the possibility that the charge 
was due to a nonuniform resistance of the crystal. 
The internal charge would have changed sign in the 
latter instance. 

(4) The magnitude of the charge was about half 
that expected from space-charge considerations and 
the geometry of the crystal. The uncertainty in area of 
contact of the electrodes could easily account for this 
discrepancy. 

The procedure in item 3 was made possible by the fact 
that the space charge, forced into the crystal, was 
mostly trapped and remained in the crystal even when 
both electrodes were later grounded before the crystal 
was released. 
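
The charge "expected from space-charge considerations and the geometry of the crystal" in item (4) is of the order of the parallel-plate capacitance times the applied voltage. The sketch below estimates that order of magnitude. The contact area and thickness are hypothetical inputs (the text notes that the contact area was uncertain), the dielectric constant of 10 is the CdS value quoted in these papers, and the simple Q = CV estimate is an assumption made here that ignores any numerical factor from the actual charge distribution.

```python
EPSILON_0_F_PER_CM = 8.854e-14  # permittivity of free space, farad/cm
ELECTRON_CHARGE_C = 1.602e-19   # electronic charge, coulombs


def expected_injected_charge(volts, area_cm2, thickness_cm, dielectric_constant=10.0):
    """Order-of-magnitude injected charge in coulombs: Q = (k eps0 A / d) * V."""
    capacitance_f = dielectric_constant * EPSILON_0_F_PER_CM * area_cm2 / thickness_cm
    return capacitance_f * volts


if __name__ == "__main__":
    # Hypothetical contact area and crystal thickness.
    q = expected_injected_charge(volts=100.0, area_cm2=1e-3, thickness_cm=5e-3)
    print(f"expected charge ~ {q:.2e} coulomb")
    print(f"equivalent electrons ~ {q / ELECTRON_CHARGE_C:.2e}")
```

Since the estimate is proportional to the contact area, an error of a factor of two in the area of the indium-tipped contacts would by itself account for a measured charge of half the expected value.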


DC CURRENT-VOLTAGE CURVES 


Figure 3, curve Jo shows an early measurement of 
the current through a crystal 5X10-* cm thick when 
the crystal was kept in the dark. At each point the 
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Fic. 3. Space-charge-limited current in an insulator. Dc 
measurement on a crystal similar to that of Fig. 2. Jo is dark 
current curve and F;, Fs, and F; are curves with increasing 
irradiation. 
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Fic. 4. Space-charge-limited current in an insulator. Jo’ initial 
dark current curve after exposure to light, J) thermal equilibrium 
dark current curve. Fi, F2, and F; curves obtained with different 
light levels on the crystal. The conductivity, density of carriers in 
conduction band m,, and theoretical square-law curves calculated 
on basis of mobility 7.= 100 and dielectric constant k= 10 for CdS. 
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current was allowed to settle down to a reasonably bul 


stationary value. In the light of later experience it is J 1001 
not certain that sufficient time was allowed to get Mf sinc 
accurately stationary values. The curve, nevertheless, | Wh 
shows the high power dependence of current on voltage, J abo 
In this case the current increases approximately as the hig! 
fourth power of the voltage. furt 

Also in Fig. 3 are shown curves marked Fj, Fe, and @ in t 
F;. These curves were taken with small amounts of J hav 


light on the crystal, the light intensity increasing from T 
F, to F;. They are significant in showing that for the @ the 
same fields at which the dark current is increasing J in ¢ 
as the fourth power of the voltage, the photoconductive esta 
current increases only as the first power. These curves J and 
support the argument that the high power dependence § crea 
of the dark current cannot be ascribed to collision § high 
ionization processes. It is also evident from Fig. 3, § way 
that at higher light intensities the transition from ohmic § that 
behavior to space-charge-limited current behavior § conc 
occurs at higher voltages. This is consistent with the §§ Und 
argument that the behavior is either ohmic or space- § are 
charge-limited depending on whether the volume § close 
generated carrier density or the injected carrier density J of t 
predominates. curr 
Figure 4 shows perhaps the most significant set of curves taken on a single crystal. The
crystal thickness was 2.5X10-* cm; the electrode area 5X10-* cm². Again, the curve marked
$I_0$ represents the current through the crystal in the dark. This time, however,
sufficient time was allowed for the current to come to a reliably stationary value at
each voltage. At the low current end, the time required was of the order of hours. The
current increases as almost the 20th power of the voltage. Such a dependence⁶ would be
expected from traps uniformly distributed in energy at least in the range of 0.55 to 0.8
volt below the conduction band. This is the range of Fermi levels appropriate to the range
of conductivities covered by curve $I_0$. It is also to be noted from this curve that the
dark conductivity at fields below $10^{4}$ volts/cm is well below 10-” (ohm-cm)⁻¹.

The curves $F_1$, $F_2$, and $F_3$ were taken with the crystal exposed to increasing
amounts of steady light. They show again, as in Fig. 3, the ohmic behavior obtained when
the volume generated carriers exceed the space-charge injected carriers. They offer
additional evidence that the electric fields ($10^{4}$ to almost $10^{5}$ volts/cm) are not
sufficient to cause collision ionization or even to significantly alter the "electron
temperature."

TIME DEPENDENT CURRENTS

The curve $I_0'$ of Fig. 4 has an important bearing not only on the interpretation of the
present work but also on the interpretation of similar data reported by Böer.⁷ Curve
$I_0'$ was taken with the crystal in darkness, within a few minutes after it had been
exposed to room light. Its particular shape is not to be emphasized since that depends on
how rapidly the curve is taken. What is to be emphasized is that the $I_0'$ curve lies
well above the $I_0$ curve and that the $I_0'$ curve represents a higher than ohmic
dependence of current on voltage. A further significant fact is that if one waited long
enough in taking this data, each point on the $I_0'$ curve would have settled down to the
$I_0$ curve.

⁶ A. Rose, following paper [Phys. Rev. 97, 1538 (1955)].
⁷ K. W. Böer and U. Kümmel, Z. Naturforsch. 9a, 177 (1954).

The $I_0'$ curve may be understood as follows. After the crystal has been exposed to room
light and is put in darkness, thermal equilibrium is not immediately established. The
conductivity decays roughly as $t^{-1}$ and so decays more slowly as the conductivity
decreases.⁸,⁹ During this decay process many of the higher-lying trapping states are
filled. A short-hand way of describing the electron distribution is to say that the
steady-state Fermi level¹⁰ is closer to the
conduction band than its final equilibrium position.® 
Under these circumstances, the space-charge electrons are injected through the conduction
band into levels closer to the conduction band so that a larger fraction of the electrons
remain free. This accounts for the currents of the $I_0'$ curve being higher than those of
the $I_0$ curve. In the final steady state condition, the currents and electron
distribution of the $I_0'$ curve settle down into those of the $I_0$ curve. This settling
process, which would occur in the dark with no voltage applied to the crystal, is
considerably hastened by application of a voltage since the rate of approaching
equilibrium is proportional to the free electron density. It is this hastening of the
decay that accounts in large part for the ac observation described at the beginning of
this paper, namely, the increase in resistance at a given voltage resulting from the
previous application of a higher voltage. Curve $I_0'$ was taken rapidly with increasing
voltages. When the voltage range is retraced from high to low voltages a curve is obtained
lying considerably below $I_0'$ and close to $I_0$.

The last statement describes also the character of current-voltage curves reported by Böer
for CdS crystals. Böer's interpretation is diametrically opposed to that taken here. He
states that the high currents of the $I_0'$ curve are a result of the emptying of traps by
the applied field, either through collision ionization or field emission from traps; in
brief, a field-induced "glow curve." What is common to both interpretations, Böer's and
ours, is that the crystal, after exposure to light, is left in a nonequilibrium condition
and that, to approach equilibrium, electrons must recombine with deep-lying trapped holes
created by previous optical excitation. In Böer's interpretation the electrons that
recombine with the deep-lying holes come from higher-lying states and must first be
excited into the conduction band by the applied field before recombining. While they are
in the conduction band they would account for the excess currents of curve $I_0'$. As they
drop into the deep-lying holes, the curve $I_0$ would be approached.

⁸ R. W. Smith, RCA Rev. 12, 350 (1951).
⁹ A. Rose, RCA Rev. 12, 362 (1951).
¹⁰ The steady-state Fermi level is defined by the relation $n = N_c \exp(-E_f/kT)$, where
$n$ is the density of free electrons, $N_c$ is normally about $10^{19}$ at room
temperature, and $E_f$ is the energy difference between the bottom of the conduction band
and the steady-state Fermi level.

Our interpretation is that up to fields of about $5\times10^{4}$ volts/cm, the field is
not strong enough to excite electrons from high-lying states into the conduction band. The
electrons that recombine with the deep-lying holes are those that are injected into the
conduction band from the cathode. The injection into the conduction band provides the
initially high currents. The recombination of conduction electrons with deep-lying holes
accounts for the approach to the low currents of curve $I_0$. The approach to $I_0$ is
accelerated by the increased density of conduction electrons injected by the applied
field. This approach takes place even in the absence of an applied field but much more
slowly, consistent with the low density of conduction electrons. The chief evidence for
the present interpretation in terms of space-charge-limited currents is the fact that the
electric fields are too low to extract electrons from or to ionize traps. Evidence of
nearly equal importance, however, is contained in the curve marked "pulse measurement" in
Fig. 4.


Pulse Measurements 


Since space charge is injected first into the conduction band and then becomes largely
trapped, one might expect to see the high theoretical currents characteristic of a
trap-free solid if one looked fast enough after applying a voltage. Such a measurement was
made using a voltage pulser¹¹ and an oscilloscope. Currents several orders of magnitude
higher than those of curve $I_0$ were observed but they still fell far short of the
theoretical curves shown in Fig. 4, both in magnitude and form. Since the oscilloscope
could not resolve time better than about 100 microseconds, it was felt that some of the
charge might be trapped before one could see its contribution to the pulsed current.

It was found, however, that if a small amount of steady bias light were used, much higher
values of pulsed current could be observed. The curve marked "pulse measurement" in Fig. 4
was taken in this way. Two characteristics of this curve are immediately striking. The
magnitudes of the currents are close to the theoretical values for a trap-free solid
computed from Eq. (1), using a mobility of 100 cm²/volt sec. Especially at the lower
fields these currents are well over 8 powers of ten higher than those of curve $I_0$. At
the high end, the current densities reach 20 amperes/cm². The second significant
characteristic is that the shape of this curve quite accurately follows the square-law
dependence on voltage required by Eq. (1). Not only does the curve satisfy the magnitude
and form of space-charge-limited currents in a trap-free solid but it implies also two
other important conclusions. The accurate square-law dependence on voltage means that in
this range of fields, up to $5\times10^{4}$ volts/cm, the mobility is not field-dependent.
The electron temperature does not depart significantly from the crystal temperature. Also,
for currents up to 20 amperes/cm² the indium contacts are still ohmic, that is, they still
supply a reservoir or excess of electrons at the metal-insulator interface. This must mean
that the Fermi level of the indium metal is within about three-tenths of a volt of the
conduction band of the CdS crystal.

¹¹ We are indebted to Dr. L. S. Nergaard for the use of this pulser which he had designed
for use in oxide cathode work.

Traces of the transient currents obtained with pulsed voltages are shown in Fig. 5. The
decay time for these currents is of the order of a millisecond. This is also the decay
time observed at high light intensities for photocurrents in this same crystal. The traps
into which the space-charge electrons and the photoelectrons decay are either the same or
at least have the same capture cross section. This cross section was computed from the
relation $s = (\tau v n)^{-1}$ cm² to be $10^{-19}$ cm². $\tau$ is the decay time,
$10^{-3}$ second; $v$ the thermal velocity of an electron, $10^{7}$ cm/sec; and $n$ is an
estimate of the total number of traps into which electrons may decay, $10^{15}$/cm³.
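
The order-of-magnitude arithmetic behind this cross section can be checked with a short sketch. It assumes the relation reads s = 1/(τ v n), with the millisecond decay time, a thermal velocity of about 10^7 cm/sec, and a trap density of about 10^15 per cm³; the function name is illustrative only and does not come from the paper.

```python
def capture_cross_section(decay_time_s, thermal_velocity_cm_s, trap_density_cm3):
    """Return the capture cross section s = 1 / (tau * v * n), in cm^2."""
    return 1.0 / (decay_time_s * thermal_velocity_cm_s * trap_density_cm3)


# Rounded values taken from the paragraph above (assumed reading of the garbled text).
s = capture_cross_section(decay_time_s=1e-3,
                          thermal_velocity_cm_s=1e7,
                          trap_density_cm3=1e15)
print(f"capture cross section ~ {s:.0e} cm^2")
```

Because all three inputs are order-of-magnitude estimates, the result should be read as no better than an order-of-magnitude figure.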

TEMPERATURE DEPENDENCE 


The injection of space-charge electrons into an insulator transforms it into a
semiconductor of increasing conductivity as the voltage is increased. At any given voltage
the current should vary with temperature in a similar fashion as for any semiconductor
having a conductivity of the same order of magnitude. The total space charge in the
crystal should not vary with temperature; but the fraction of space charge that is free
should in general increase exponentially with increasing temperature. In reference 6 the
following relation was obtained to describe the effect of temperature on the form of the
current-voltage curve:

$$I \propto V^{(T_c/T)+1}. \tag{2}$$

$T_c$ is a characteristic temperature describing the distribution of traps in energy.
Small values of $T_c$ are associated with trap distributions varying rapidly with energy,
while large values of $T_c$ approximate a slowly varying trap distribution. From Eq. (2),
one would expect a steeper current-voltage curve at lower temperatures. Figure 6 shows
three current-voltage curves taken at different temperatures. Qualitatively the currents
are smaller for lower temperatures, and the curves steeper at lower temperatures, in
accordance with the expected behavior. Quantitatively, the ratio of currents between the
371°K and the 300°K curves is of the right order. The currents of the 77°K curve should be
much lower. It is possible that the electron distribution in the crystal did not come into
thermal equilibrium with the crystal temperature. At these low carrier densities, approach
to thermal equilibrium may easily require hours or days.

Fig. 5. Current in CdS crystal due to square voltage pulses of 20-millisecond duration
(current versus time; time axis marked at 20 ms). Curves 1, 2, and 3 correspond to initial
currents $I_0$ of 6X10-*, 2X10-*, and 3X10-* amp respectively.

Fig. 6. Space-charge-limited current in CdS as a function of temperature.
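
To make the expected steepening at low temperature concrete, the following sketch evaluates the exponent of the power law, read here as I ∝ V^(T_c/T + 1), at the three temperatures mentioned. The characteristic temperature used (5700°K) is a hypothetical value chosen only so that the room-temperature exponent is 20; it is not a figure reported in the paper.

```python
def power_law_exponent(t_c_kelvin, t_kelvin):
    """Exponent m in I ~ V**m for an exponential trap distribution: m = T_c/T + 1."""
    return t_c_kelvin / t_kelvin + 1.0


# Hypothetical characteristic temperature (assumption, for illustration only).
T_C_ASSUMED = 5700.0

exponents = {t: power_law_exponent(T_C_ASSUMED, t) for t in (371.0, 300.0, 77.0)}
for t, m in exponents.items():
    print(f"T = {t:5.0f} K  ->  I ~ V^{m:.1f}")
```

The exponent grows as the temperature falls, which is the qualitative trend described for Fig. 6.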


COMPARISON OF SPACE-CHARGE-LIMITED CURRENT 
WITH PHOTOCONDUCTIVE CURRENT 

The energy distribution of traps determines the form of the current-light curve in
photoconductivity measurements and also the speed of response. The energy distribution of
traps also determines the form and magnitude of the space-charge-limited current-voltage
curves. Conversely, from the observed curve forms and speeds of response one should be
able to obtain the energy distribution of traps. A pertinent question is whether the trap
distribution determined in the same crystal by the two independent methods of
space-charge-limited currents and photoconductive currents shows any self consistency.
Table I summarizes the appropriate relations derived in references 6 and 9.

In the limit for large values of $T_c$, the trap distribution becomes uniform in energy,
the space-charge-limited current increases exponentially with voltage and the photocurrent
increases linearly with light intensity. Figure 4 shows the exponential dependence of
space-charge-limited current on voltage. Figure 7 shows the linear dependence of
photocurrent on light intensity. Both sets of data were taken on the same crystal through
the same electrodes. Both sets of data are consistent with a uniform or near uniform
energy distribution of traps in the range of 0.5 to 0.8 ev below the conduction band.

The trap density computed from the space-charge-limited curve $I_0$ of Fig. 4 was 5X10”
traps/cm³ in an energy range of $kT$ at 0.7 volt below the conduction band. From
photoconductivity measurements at the light levels of curves $F_1$ and $F_2$ of Fig. 4,
the trap density was computed to be 0.5X10"/cm³ per $kT$ at the same depth below the
conduction band. Of the two estimates the space-charge-limited computation is likely to be
the more reliable since the character or capture cross sections of the traps do not enter
in. In the photoconductivity measurement, only those traps are measured from which thermal
excitation occurs rapidly enough to keep pace with the decaying photocurrent (see
reference 9).


TABLE I. The energy distribution and density of traps derived from data on
photoconductivity and from data on space-charge-limited currents.*

                                  Space-charge-limited currents   Photoconductivity

Form of current curve             $I \propto V^{(T_c/T)+1}$       $I \propto F^{T_c/(T+T_c)}$

Trap density in range of $kT$
near the Fermi level              $\Delta Q/e$                    $(\tau_0/\tau)\,n_c$

* Notes: $T_c$ defines the trap distribution in the relation $n_t \propto \exp(-E/kT_c)$,
where $n_t$ is the trap density at $E$ volts below the conduction band. $F$ is the number
of optical excitations per second. $\Delta Q$ is the space charge forced into the crystal
when the voltage is increased by an amount sufficient to double the current. $\tau_0$ is
the observed speed of response of the photoconductor. $\tau$ is the lifetime of a free
carrier in the conduction band. $n_c$ is the density of free carriers at which $\tau_0$ is
measured.

Fig. 7. Photocurrent-light ($F$) characteristic of CdS crystal.

The time of response at $F_1$ was 50 seconds and at $F_2$ was 6.6 seconds. At high light
intensities the time of response leveled off at a value of $10^{-3}$ second, which should
be close to the true lifetime of a carrier in the conduction band.


CONCLUDING REMARKS 


Currents far in excess of ohmic currents have been 
measured through thin insulating crystals of CdS in 
the dark. A set of experiments has been described that 
identify these currents as space-charge-limited currents 
in a solid. These experiments include the direct measure- 
ment of the space charge; the matching of the form 
and magnitude of the theoretically expected current 
in a trap-free solid by using pulsed voltages; the 
analysis of the much lower steady-state currents in 
terms of a trap distribution that is consistent with 
independent photoconductivity measurements; and the 
confirmation that these currents can be obtained at 
fields too low to cause collision ionization or field 
emission from traps. The fact that the steady-state 
space-charge-limited currents are more than eight 
powers of ten lower than those in a trap-free solid 
(Fig. 4) is evidence that space-charge-limited currents 
represent one of the most sensitive tools for measuring 
the presence of traps, especially in those insulating 
crystals that approach a high degree of perfection. 

Currents, far in excess of ohmic currents, can be drawn through thin, relatively perfect insulating crystals.
These currents are the direct analog of space-charge-limited currents in a vacuum diode. In actual crystals,
the space-charge-limited currents are less than their theoretical value for an ideal crystal by the ratio of free
to trapped carriers. Space-charge-limited currents become, therefore, a simple tool for measuring the
imperfections in crystals even in the range of one part in 10".

The presence of traps not only reduces the magnitude of space-charge-limited currents, but also is likely
to distort the shape of the current-voltage curve from an ideal square law to a much higher power
dependence on voltage. The particular shape can be used to determine the energy distribution of traps.

The presence of traps tends to uniformize the charge distribution between electrodes, to introduce a
temperature dependence of the current, and to give rise to certain transient effects from which capture cross
sections of traps may be computed.

Space-charge-limited currents offer another mechanism for electrical breakdown in insulators.





I. INTRODUCTION


THE solid-state analog of space-charge-limited currents in a vacuum diode is the
space-charge-limited current in an insulator. This was clearly pointed out at least
fifteen years ago as a simple consequence of the band theory of solids.¹

While there have been many references, as in the work of Hilsch, Gudden, and Pohl,² to the
transient effects of space charge in solids, there have not been until recently direct
measurements of steady-state space-charge-limited currents.³⁻⁵ The lack of such
measurements is remarkable since simple theory allows amperes per square centimeter of
space-charge-limited current to be passed through thin sheets of insulators. Two
requirements, however, need to be fulfilled in order to observe space-charge-limited
currents of significant magnitude: At least one of the two electrodes must make ohmic
contact⁶,⁷ to the insulator and the insulator must be relatively free from trapping
defects. The concept of an ohmic contact to an insulator is perhaps not a common one and
needs to be defined. An ohmic contact is used here to mean an electrode that supplies an
excess or a reservoir of carriers ready to enter the insulator as needed. The virtual
cathode formed in front of a thermionic emitter in a vacuum diode is a familiar example of
an ohmic contact to the insulating vacuum space between cathode and anode.⁸ The current
through the vacuum diode or between electrodes in an insulating solid does not depend on
the amount of excess carriers as long as there is an excess.

¹ N. F. Mott and R. W. Gurney, Electronic Processes in Ionic Crystals (Oxford University
Press, New York, 1940), p. 172.
² B. Gudden, Lichtelektrische Erscheinungen (Verlag Julius Springer, Berlin, 1928).
³ P. K. Weimer and A. D. Cope, RCA Rev. 12, 314 (1951).
⁴ A. Rose, RCA Rev. 12, 362 (1951).
⁵ R. W. Smith and A. Rose, Phys. Rev. 92, 857 (1953); A. Rose and R. W. Smith, Phys. Rev.
92, 857 (1953).
⁶ W. Shockley and R. C. Prim, Phys. Rev. 90, 753 (1953); G. C. Dacey, Phys. Rev. 90, 759
(1953).
⁷ R. W. Smith, this issue [Phys. Rev. 97, 1525 (1955)].
⁸ L. S. Nergaard [RCA Rev. 13, 464 (1952)] proposes a model of an oxide cathode in which
the flow of current within the cathode coating itself, as well as in the vacuum just
outside the cathode, may be space-charge-limited.

Figure 1 shows one example of an ohmic contact to an insulator obtained by the use of a
metal whose work function is less than that of the insulator. The presence of the virtual
cathode is evident in Fig. 1(b).

The requirement of relative freedom from traps will be made quantitative later. For the
present, it is sufficient to point out that traps lower the drift mobility of carriers and
thereby the magnitude of the space-charge-limited currents.⁴ Trap densities of
$10^{18}$/cm³ (not unreasonable for the usual polycrystalline insulator) would be
sufficient to reduce the space-charge-limited currents to almost unmeasurable values.

The measurements of space-charge-limited currents reported by Smith⁹ are on relatively
perfect insulating crystals of CdS having ohmic contacts. Even so, there are a sufficient
number of traps that the simple model for space-charge-limited currents needs to be
modified as in the following analysis to take their effect into account. The traps not
only reduce the magnitude of the space-charge-limited current, but also distort the shape
of the current-voltage curve and add certain interesting and informative transient
effects.

Fig. 1. (a) Ohmic contacts to an insulator at zero applied field. (b) Finite field applied
to Fig. 1(a) showing virtual cathode at A.

⁹ R. W. Smith and A. Rose, preceding paper [Phys. Rev. 97, 1531 (1955)].

The analysis of space-charge-limited currents is carried out in terms of the following
approximate but simple formalism. Let the space between two electrodes have a capacitance
$C$. This is an approximating concept. In the case of plane parallel electrodes the
capacitance is that between the two electrodes. The charge that can be accommodated in the
interior space is

$$Q = CV, \tag{1}$$

where $V$ is the applied voltage. The space-charge-limited current is immediately given by

$$I = Q/T, \tag{2}$$

where $T$ is the transit time of the charge $Q$ between electrodes.

The well-known expressions for space-charge-limited currents in vacuum and in a trap-free
insulator are readily derivable from Eq. (2). They are given here to clarify the
formalism.

II. VACUUM DIODE


The space charge forced into the vacuum diode per cm² of plate area and for a plate
separation of $d$ cm is

$$Q = CV = (V/4\pi d)\times 10^{-12}\ \text{coulomb}. \tag{3}$$

The transit time of the charge $Q$ between plates is approximately

$$T = d/(6\times 10^{7}\, V^{1/2})\ \text{sec}. \tag{4}$$

The space-charge-limited current is, from (2), (3), and (4),

$$I = 5\times 10^{-6}\,(V^{3/2}/d^{2})\ \text{amperes/cm}^2. \tag{5}$$

The accurate value of the coefficient is $2.3\times 10^{-6}$.
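
The charge-over-transit-time estimate for the vacuum diode can be reproduced numerically. This is a minimal sketch that assumes the relations read Q = (V/4πd)×10⁻¹² coulomb and T = d/(6×10⁷ V^½) sec; the comparison coefficient 2.33×10⁻⁶ is the standard Child-Langmuir value, and the function names and sample voltage and spacing are illustrative assumptions.

```python
import math


def vacuum_diode_estimate(v_volts, d_cm):
    """Approximate space-charge-limited current density (A/cm^2) as I = Q / T."""
    q = v_volts / (4.0 * math.pi * d_cm) * 1e-12   # charge per cm^2, coulomb
    t = d_cm / (6e7 * math.sqrt(v_volts))          # transit time, sec
    return q / t


def child_langmuir(v_volts, d_cm):
    """Accurate vacuum-diode result, J = 2.33e-6 * V**1.5 / d**2 (A/cm^2)."""
    return 2.33e-6 * v_volts ** 1.5 / d_cm ** 2


# Hypothetical operating point, chosen only for illustration.
v, d = 100.0, 0.1
estimate = vacuum_diode_estimate(v, d)
exact = child_langmuir(v, d)
print(f"estimate = {estimate:.3e} A/cm^2, exact = {exact:.3e} A/cm^2, "
      f"ratio = {estimate / exact:.2f}")
```

The simple formalism overestimates the accurate coefficient by about a factor of two, consistent with the two coefficients quoted above.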


III. TRAP-FREE INSULATOR 


The space charge forced into an insulator per cm² of plate area is, from Eq. (1),

$$Q = (Vk/4\pi d)\times 10^{-12}\ \text{coulomb}; \tag{6}$$

$k$ is the dielectric constant of the insulator and $d$ the electrode spacing. The transit
time of the charge $Q$ between electrodes is

$$T = d/E\mu = d^{2}/V\mu. \tag{7}$$

$E$ is the electric field in the insulator and $\mu$ the drift mobility. From Eqs. (2),
(6), and (7) the space-charge-limited current is

$$I = 10^{-13}\,(V^{2}\mu k/d^{3})\ \text{amperes/cm}^2. \tag{8}$$

The accurate value of the coefficient is also $10^{-13}$.

Fig. 2. Insulator having shallow traps in thermal equilibrium with electrons in the
conduction band.
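
The same Q/T bookkeeping can be checked for the trap-free insulator. The sketch below assumes the relations read Q = (Vk/4πd)×10⁻¹² coulomb and T = d²/(Vμ), and compares the result with the standard Mott-Gurney expression J = (9/8) ε μ V²/d³. The sample voltage, spacing, and dielectric constant are hypothetical; only the mobility of 100 cm²/volt sec echoes a value quoted elsewhere in the text.

```python
import math

EPS0_F_PER_CM = 8.854e-14  # permittivity of free space, farad/cm


def trap_free_estimate(v_volts, d_cm, mu, k):
    """Approximate current density (A/cm^2) from I = Q / T for a trap-free insulator."""
    q = v_volts * k / (4.0 * math.pi * d_cm) * 1e-12  # charge per cm^2, coulomb
    t = d_cm ** 2 / (v_volts * mu)                    # transit time, sec
    return q / t


def mott_gurney(v_volts, d_cm, mu, k):
    """Accurate trap-free result, J = (9/8) * eps0 * k * mu * V**2 / d**3 (A/cm^2)."""
    return 9.0 / 8.0 * EPS0_F_PER_CM * k * mu * v_volts ** 2 / d_cm ** 3


# Hypothetical sample parameters (assumptions, for illustration only).
v, d, mu, k = 10.0, 2.5e-3, 100.0, 10.0
estimate = trap_free_estimate(v, d, mu, k)
exact = mott_gurney(v, d, mu, k)
print(f"estimate = {estimate:.3e} A/cm^2, Mott-Gurney = {exact:.3e} A/cm^2")
```

The estimate gives a coefficient near 0.8×10⁻¹³ and the Mott-Gurney expression gives very nearly 10⁻¹³, so both round to the coefficient used in the text.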


IV. INSULATOR WITH SHALLOW TRAPS 


Let the insulator have only shallow traps (Fig. 2), that is, traps lying close enough to
the conduction band to be in thermal equilibrium with electrons in the conduction band.
The same expression for the space-charge-limited current will be obtained as in the case
of the trap-free insulator. One need only insert for the drift mobility the product of the
drift mobility for free carriers and the fraction of the total space charge that is free.
While the same total charge is forced into the insulator as in the case of the trap-free
insulator, only a fraction of this charge is free. The drift mobility must be reduced by
the same fraction. The value of this fraction is determined by the number and depth of
traps and is not dependent on the applied voltage.¹⁰ Accordingly, the space-charge-limited
current has the same square-law dependence on voltage as in the simple trap-free model of
Eq. (8).

Let the fraction of free charge be $\theta$. The space-charge-limited current is then
given by

$$I = 10^{-13}\,[V^{2}(\mu_0\theta)k/d^{3}]\ \text{amperes/cm}^2, \tag{9}$$

where $\mu_0$ is the drift mobility of free carriers.

If there is a single level of shallow traps whose density is $N_t$/cm³ and whose distance
from the conduction band is $E$ volts, the fraction $\theta$ is given at room temperature
by the approximate relation

$$\theta = (N_c/N_t)\,e^{-E/kT}, \tag{10}$$

where $N_c = 10^{19}$ at room temperature. For $N_t$=10" and $E$=0.5 volt,
$\theta = 10^{-7}$ and the space-charge-limited currents are sharply reduced.
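
A short sketch shows how strongly a single shallow trap level reduces the current. It assumes the free fraction reads θ = (N_c/N_t) e^(−E/kT) with kT ≈ 0.0259 volt at room temperature and N_c = 10¹⁹ per cm³. The trap density used below is a hypothetical value, since the figure in the source is not legible, and the function names are illustrative.

```python
import math

KT_VOLTS = 0.0259  # kT/e at room temperature, volts


def free_fraction(n_c, n_t, trap_depth_volts, kt_volts=KT_VOLTS):
    """Fraction of injected charge that is free; valid only when the result is << 1."""
    return (n_c / n_t) * math.exp(-trap_depth_volts / kt_volts)


def shallow_trap_current(v_volts, d_cm, mu0, k, theta):
    """Space-charge-limited current density (A/cm^2) with mobility reduced by theta."""
    return 1e-13 * v_volts ** 2 * (mu0 * theta) * k / d_cm ** 3


# N_t = 1e18 per cm^3 is an assumed, illustrative trap density.
theta = free_fraction(n_c=1e19, n_t=1e18, trap_depth_volts=0.5)
print(f"theta ~ {theta:.1e}")
print(f"current reduced by a factor of ~ {1.0 / theta:.1e} relative to the trap-free case")
```

Because θ does not depend on voltage, the reduced current keeps the square-law form; only its magnitude changes.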


V. INSULATOR WITH TRAPS DISTRIBUTED 
IN ENERGY 


Consider, as shown in Fig. 3, an insulator in which the traps are distributed uniformly in
energy below the conduction band. The prominent characteristics of this model are a
consequence of the distribution of traps in energy and not of the strict uniformity of
distribution. For a given applied voltage, the charge, $Q$, forced into the insulator, is
distributed in three major parts: free charge in the conduction band, trapped charge above
the newly determined Fermi level, and trapped charge condensed in the states between the
original Fermi level and the newly determined Fermi level. Since the condensed charge is
likely to be very nearly the total charge, the new location of the Fermi level is given
very closely by considering all of the charge $Q$ to be condensed. With this
approximation, the shift in Fermi level will be proportional to the space charge $Q$
which is, in turn, proportional to the applied voltage $V$.

We can write for the free carrier density

$$n = N_c\, e^{-(E_f-\Delta E)/kT}. \tag{11}$$

Here $N_c$ is the number of states in the bottom $kT$ slice of the conduction band, $E_f$
is the original distance of the Fermi level from the conduction band and $\Delta E$ is the
shift in position of the Fermi level owing to the condensed charge $Q$ forced into the
insulator by the applied voltage $V$. Also, from previous remarks:

$$\Delta E = Q/e n_t d = VC/e n_t d, \tag{12}$$

where $n_t$ is the number of traps per cm³ per unit range in energy and $e$ the electron
charge. From Eqs. (11) and (12) the free carrier density is given by

$$n = N_c\, e^{-E_f/kT}\, e^{VC/n_t e d kT} \tag{13a}$$

$$\phantom{n} = n_0\, e^{\alpha V}, \tag{13b}$$

where $n_0$ is the initial, thermal equilibrium concentration of free carriers and
$\alpha$ is used for $C/n_t d e kT$.

The density of trapped carriers is very nearly equal to the total density of injected
electrons, or

$$\text{density of trapped carriers} = Q/de = VC/de. \tag{14}$$

The fractional value of free charge is, from Eqs. (13) and (14),

$$\theta = (e n_0 d/VC)\, e^{\alpha V}. \tag{15}$$

$\theta$ is no longer a constant as in the previous case of shallow traps, but depends
exponentially on the applied voltage. From Eqs. (9) and (15) the space-charge-limited
current becomes:

$$I = 10^{-13}\,(V\mu_0 k/d^{2})\,(e n_0/C)\, e^{\alpha V}. \tag{16}$$

What is significant in Eq. (16) is that, owing to the distribution of traps in energy, the
space-charge-limited current now increases exponentially with voltage compared with the
square law dependence on voltage obtained in the trap-free and in the shallow-trap models.
The exponential dependence is a consequence of the assumption of a uniform distribution of
traps. If the uniform distribution of traps is replaced by one that decreases with
distance from the conduction band, the exponential is replaced by a high power function of
the voltage.

In particular, let the steepness of the trap distribution be approximated by a
characteristic temperature $T_c$ such that

$$n_t \propto e^{-E/kT_c}, \tag{17}$$

where $E$ is measured from the bottom of the conduction band. Small values of $T_c$ lead
to trap distributions varying rapidly with energy, while large values of $T_c$ approximate
a slowly varying trap distribution. The voltage dependence (see Appendix I) of
space-charge-limited current is (for $T_c > T$):

$$I \propto V^{(T_c/T)+1}. \tag{18}$$

For $T_c < T$, this reduces to the case of shallow traps where the exponent of $V$ is 2.

Fig. 3. Insulator having a distribution of traps in energy and showing the shift in Fermi
level due to charge injected by an applied field.

¹⁰ The electron temperature is assumed here to be the same as the crystal temperature.
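
The voltage dependence for a uniform trap distribution can be explored numerically. The sketch below assumes the current reads I = 10⁻¹³ (V μ₀ k/d²)(e n₀/C) e^(αV) with α = C/(n_t d e kT), takes C per cm² as (k/4πd)×10⁻¹² farad, and expresses kT in volts. All numerical inputs are hypothetical, and the function names are illustrative. It computes the local logarithmic slope d(ln I)/d(ln V), which for this form should equal 1 + αV.

```python
import math

E_CHARGE = 1.602e-19  # electron charge, coulomb
KT_VOLTS = 0.0259     # kT/e at room temperature, volts


def capacitance_per_cm2(k, d_cm):
    """Approximate capacitance per cm^2 of electrode area, farad."""
    return k / (4.0 * math.pi * d_cm) * 1e-12


def alpha(k, d_cm, n_t):
    """alpha = C / (n_t * d * e * kT), in 1/volt; n_t is traps per cm^3 per volt."""
    return capacitance_per_cm2(k, d_cm) / (n_t * d_cm * E_CHARGE * KT_VOLTS)


def current_density(v, k, d_cm, mu0, n0, n_t):
    """Current density (A/cm^2) for traps uniformly distributed in energy."""
    c = capacitance_per_cm2(k, d_cm)
    a = alpha(k, d_cm, n_t)
    return 1e-13 * (v * mu0 * k / d_cm ** 2) * (E_CHARGE * n0 / c) * math.exp(a * v)


def log_slope(v, **params):
    """Numerical d(ln I)/d(ln V) by a central difference."""
    h = 1e-4 * v
    hi = math.log(current_density(v + h, **params))
    lo = math.log(current_density(v - h, **params))
    return (hi - lo) / (math.log(v + h) - math.log(v - h))


# Hypothetical parameters (assumptions, for illustration only).
params = dict(k=10.0, d_cm=2.5e-3, mu0=100.0, n0=1e3, n_t=1e15)
v = 100.0
print(f"alpha = {alpha(10.0, 2.5e-3, 1e15):.4f} per volt")
print(f"local power-law exponent at {v:.0f} V = {log_slope(v, **params):.2f}")
```

The local exponent rises steadily with voltage, which is how an exponential dependence appears on a log-log plot of current against voltage.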


VI. TRAP DISTRIBUTION FROM I VS V CURVE 


One can expect to work backwards from an experimentally determined current-voltage curve
to obtain the energy distribution of traps. Equation (18), for example, gives the trap
distribution for experimental curves for which the current increases as a power of the
voltage. For a current-voltage curve of arbitrary form, and for currents increasing faster
than $V^{2}$, the following analysis may be made. From Eq. (9) one may write

$$I = \text{constant}\times V e^{\Delta E/kT}, \tag{19}$$

and

$$\frac{dI}{dV} = \frac{I}{V} + \frac{I}{kT}\,\frac{d(\Delta E)}{dV}. \tag{20}$$

The solution of Eq. (20) for $d\Delta E/dV$ is

$$\frac{d\Delta E}{dV} = \left(\frac{1}{I}\frac{dI}{dV} - \frac{1}{V}\right) kT. \tag{21}$$

Since the charge condensed in traps is

$$Q = VC,$$

Eq. (21) may be rewritten as

$$C\,\frac{d\Delta E}{dQ} = \left(\frac{1}{I}\frac{dI}{dV} - \frac{1}{V}\right) kT, \tag{22}$$

or

$$\frac{dQ}{d\Delta E} = \frac{C}{kT}\left(\frac{1}{I}\frac{dI}{dV} - \frac{1}{V}\right)^{-1}. \tag{23}$$

In Eq. (23), $dQ/e\,d\Delta E$ is the number of traps per unit energy range in the volume
of specimen under test. A simple operational interpretation of Eq. (23) is the following.
If one increases the applied voltage by an amount $\Delta V$ sufficient to double the
current, the number of electron charges forced into the insulator is $\Delta V C/e$. This
number is also the number of traps in a range $kT$ near the Fermi level.
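
The inversion from a current-voltage curve to a trap density can be sketched numerically. The code assumes the final relation of this section reads dQ/dΔE = (C/kT)[(1/I)(dI/dV) − 1/V]⁻¹, with kT expressed in volts so that dividing by e gives traps per cm² of electrode area per volt of energy. It is applied to a synthetic curve I ∝ V e^(αV) generated from a known, hypothetical trap density, so that the recovered value can be compared with the input. Function names and parameter values are illustrative assumptions.

```python
import math

E_CHARGE = 1.602e-19  # electron charge, coulomb
KT_VOLTS = 0.0259     # kT/e at room temperature, volts


def traps_per_volt_per_cm2(v, current, c_per_cm2, rel_step=1e-4):
    """Traps per unit energy (per volt) per cm^2 of electrode area from an I-V curve."""
    h = rel_step * v
    dlni_dv = (math.log(current(v + h)) - math.log(current(v - h))) / (2.0 * h)
    return c_per_cm2 / (E_CHARGE * KT_VOLTS) / (dlni_dv - 1.0 / v)


# Synthetic test case with hypothetical parameters (assumptions, for illustration).
k, d_cm, n_t_true = 10.0, 2.5e-3, 1e15           # n_t in traps per cm^3 per volt
c = k / (4.0 * math.pi * d_cm) * 1e-12           # capacitance per cm^2, farad
a = c / (n_t_true * d_cm * E_CHARGE * KT_VOLTS)  # alpha, 1/volt


def synthetic_current(v):
    """Arbitrary-units current for a uniform trap distribution, I ~ V * exp(alpha * V)."""
    return v * math.exp(a * v)


n_t_recovered = traps_per_volt_per_cm2(100.0, synthetic_current, c) / d_cm
print(f"input n_t = {n_t_true:.3e}, recovered n_t = {n_t_recovered:.3e} per cm^3 per volt")
```

For this synthetic curve the voltage step that doubles the current is close to ln 2/α, so the operational rule ΔV C/e agrees with the trap count in a range kT to within a factor of order unity.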


VII. COMPARISON OF SPACE-CHARGE-LIMITED 
CURRENT WITH PHOTOCONDUCTIVE 
CURRENT 


The last two sections have shown how the form and magnitude of the trap distribution may
be computed from the space-charge-limited current-voltage curve.


TABLE I. The energy distribution and density of traps derived from data on
photoconductivity and from data on space-charge-limited currents.*

                                  Space-charge-limited currents   Photoconductivity

Form of current curve             $I \propto V^{(T_c/T)+1}$       $I \propto F^{T_c/(T+T_c)}$

Trap density in range of $kT$
near the Fermi level              $\Delta Q/e$                    $(\tau_0/\tau)\,n_c$

* Notes: $T_c$ defines the trap distribution by Eq. (17). $F$ is the number of optical
excitations per second. $\Delta Q$ is the charge forced into the insulator when the
voltage is increased by an amount sufficient to double the current. $\tau_0$ is the
observed response time of the photoconductor to interrupted light. $\tau$ is the lifetime
of a free carrier in the conduction band. $n_c$ is the density of free carriers at which
$\tau_0$ is measured. The Fermi level is defined by the relation:
$n_c = N_c e^{-E_f/kT} = 10^{19} e^{-E_f/kT}$.


In reference 4, it was argued that the same information on trap distribution could be
obtained from data on the form of the photocurrent-light curve and from data on the ratio
of lifetime to observed time constant.

The results of the two analyses are summarized in Table I.

The characteristic temperature, computed from the space-charge-limited currents, should be
more reliable than the characteristic temperature computed from the photoconductive
currents. In the analysis of the latter an implicit assumption was made that all of the
traps had the same capture cross section for electrons. The validity of this assumption is
under study and must, in any event, be tested for each new crystal. There are some
observations that require the presence of more than one type of trap and such mixtures can
alter the form of the current-light curve. The form of the space-charge-limited
current-voltage curve on the other hand should not be dependent on the capture cross
section of the traps. Reasonable agreement between the two independent methods of
measuring trap distributions is reported by Smith and Rose⁹ and by Bube.¹¹

¹¹ R. H. Bube and S. M. Thomsen, J. Chem. Phys. 23, 15 (1955).

VIII. FIELD AND CHARGE DISTRIBUTION
BETWEEN ELECTRODES


For the simple case of trap-free insulator and plane parallel electrodes, the following
relations for current, field, and charge distribution are known:

$$I \propto V^{2}, \tag{24}$$

$$E \propto x^{1/2}, \tag{25}$$

$$\rho \propto x^{-1/2}, \tag{26}$$

where $E$ is the electric field, $x$ the distance from the cathode (for electron
injection), and $\rho$ the space charge density.

In Appendix II it is shown that, in general, when traps are present and when

$$I \propto V^{n+1}, \qquad n > 1, \tag{27}$$

the field and charge distributions take on the forms

$$E \propto x^{n/(n+1)}, \tag{28}$$

$$\rho \propto x^{-1/(n+1)}. \tag{29}$$

For large values of m, the space charge density ap- 
proaches a uniform distribution over most of the dis- 
tance between cathode and anode. The free charge 
density, however, must always vary as the reciprocal 
of the field in order to keep the divergence of the cur- 
rent zero. Since the free charge is usually a negligible 
part of the total charge, it may undergo large varia- 
tions without having significant effect on the distribu- 
tion of the total charge. 

The relative uniformity of charge density between 
cathode and anode leads one to expect only small or 
negligible currents when these electrodes are shorted 
together. The space charge flowing out of the insulator 
tends to flow out equally at both ends. Smith⁹ has observed
the short-circuit current to be negligibly small.
This is to be contrasted with the relatively large short- 
circuit reverse currents obtained from dielectric ab- 
sorption effects as in some glasses. 


IX. TRANSIENT EFFECTS 


The following observation on space-charge-limited 
currents in CdS crystals is reported by Smith.⁹ A
sudden increase in voltage causes the current to transi- 
ently increase to very high values. In a matter of 
seconds or minutes the current subsides to a much 
smaller stationary value. The interpretation is that the 
sudden increase in voltage forced a corresponding in- 
crease of charge in the conduction band. In the course 
of seconds, most of this free charge settles into traps 
and one observes the rapid decay of current. The time 
required for the transient current to subside is a direct 
measure of the capture cross section of traps for free 
electrons. 

If the space-charge-limited current has attained a 
stationary value at a given voltage it is found⁹ that
lowering the voltage from this value may cause the


FIG. 4. Series of potential patterns showing the transient
effects when the applied voltage is reduced. Panel (c): final condition.


current to “undershoot” its new stationary value. The 
interpretation here is shown in Fig. 4. Figure 4(a) 
shows the stationary potential distribution at an applied 
voltage $V_1$. When the voltage is lowered to $\tfrac{1}{2}V_1$, it
requires some time for the trapped space charge forced
into the crystal at $V_1$ to be thermally released. Before
this trapped charge is thermally released, a space- 
charge barrier is presented to the cathode, as shown in 
Fig. 4(b). This is not a virtual cathode type of barrier. 
It actually suppresses the entrance of electrons from 
the cathode into the insulator. As time goes on, the 
trapped charge is thermally released and the potential 
distribution arrives at the new stationary value shown 
in Fig. 4(c). If the thermal release of carriers is suffi- 
ciently slow (traps of small capture cross section) the 
current can “undershoot” its final value. If the thermal
release is fast there will be no “undershoot” but actually 
an “overshoot.” 


X. TEMPERATURE DEPENDENCE 


The injection of space charge into an insulator con- 
verts it into a semiconductor of increasing conductivity 
with increasing voltage. At any given voltage the cur- 
rent should vary with temperature as would any semi- 
conductor having the same conductivity. (This does 
not mean that the temperature variation of conduc- 
tivity is determined only by the conductivity. As in 
any semiconductor the trap distribution governs the 
temperature dependence.) An increase in temperature 
does not alter the total amount of space charge, but 
does increase the fraction of this space charge in the 
conduction band. 


Equation (18) indicates that lowering the tempera- 
ture should make the current-voltage curve steeper. 
For very steep curves, the effect of lowering the tem- 
perature should be one of shifting the current-voltage 
curve along the voltage axis toward higher voltages. 
To match the same current the Fermi level must be 
closer to the conduction band at lower temperatures 
and this requires higher voltages according to Eq. (12). 


XI. TRANSITION FROM OHMIC TO 
SPACE-CHARGE-LIMITED 
CURRENTS 


Space-charge-limited currents increase as the square 
or as some higher power of the voltage. Ohmic currents 
increase linearly with the voltage. One would expect, 
therefore, that for any finite conductivity, there would 
be a range of voltages near zero for which the ohmic 
currents would predominate. For voltages higher 
than some critical voltage, space-charge-limited cur- 
rents would predominate. The critical voltage at which 
this transition from ohmic to space-charge-limited be- 
havior takes place should increase as the normal 
volume-generated conductivity increases. Results of this 
character are clearly reported in reference 9 where the 
critical voltage is varied by shining light on a CdS 
crystal. 

What has just been described should certainly take 
place if the ohmic and space-charge-limited currents 
were in parallel, physically separate paths. When the 
two types of current occupy the same physical volume, 
the transition from one current to the other is likely 
to be somewhat more involved because the potential 
distributions are different for the two types of current. 
There will then be a competition between the two 
processes to establish their appropriate potential dis- 
tribution. It would appear, however, from qualitative 
arguments that the mechanism that introduced the 
larger density of free carriers would control the po- 
tential distribution. Accordingly, higher volume gen- 
erated carrier densities mean that a higher voltage is 
required before the injected space-charge-carrier densi- 
ties predominate and determine the character of the 
current-voltage curve. 
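
The competition between ohmic and space-charge-limited currents can be illustrated with a rough estimate of the critical voltage. This sketch assumes the square law in its standard Mott-Gurney form, $J = (9/8)\theta\kappa\epsilon_0\mu V^2/d^3$, with a voltage-independent $\theta$ (a simplification, since for an exponential trap distribution $\theta$ itself varies with voltage), and equates it to the ohmic current $J = e n_0 \mu V/d$. The function names and the sample carrier densities are hypothetical.

```python
EPS0 = 8.854e-14   # permittivity of free space, farads/cm
Q_E = 1.602e-19    # electron charge, coulombs


def ohmic_current_density(volts, d_cm, n0_cm3, mu):
    """Ohmic current density (A/cm^2) for n0 volume-generated carriers per cm^3."""
    return Q_E * n0_cm3 * mu * volts / d_cm


def scl_current_density(volts, d_cm, kappa, mu, theta=1.0):
    """Square-law space-charge-limited current density (A/cm^2), assumed form."""
    return 9.0 / 8.0 * theta * kappa * EPS0 * mu * volts ** 2 / d_cm ** 3


def crossover_voltage(d_cm, n0_cm3, kappa, theta=1.0):
    """Voltage at which the two current densities above are equal."""
    return 8.0 * Q_E * n0_cm3 * d_cm ** 2 / (9.0 * theta * kappa * EPS0)


if __name__ == "__main__":
    # Hypothetical crystal: 0.1 cm thick, dielectric constant 10.
    for n0 in (1.0e8, 1.0e9, 1.0e10):
        print(n0, crossover_voltage(0.1, n0, 10.0))
```

Under these assumptions the critical voltage is proportional to the volume-generated carrier density, in line with the statement that it should increase as the normal conductivity increases.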

It is interesting that even in the range of voltage 
where the ohmic currents predominate in the steady 
state, the space-charge-limited currents may determine 
the transient behavior. This follows from the fact that 
when the voltage is increased there is a transient high 
density of space-charge-carriers in the conduction band 
—a density that may exceed that of the volume gen- 
erated carriers. As these space-charge-carriers become 
trapped, their density falls below that of the volume- 
generated carriers and the latter lead to steady-state 
ohmic currents. 


XII. TOOL FOR MEASURING CRYSTAL DEFECTS 


As already outlined in an earlier section, the number 
and energy distribution of traps can be deduced from
the current-voltage curve for space-charge-limited currents.
What needs to be emphasized here is that these
currents constitute an unusual tool for measuring 
defect structure—a tool that becomes particularly 
effective in the range of low concentration of defects. 

The effect of traps is generally to reduce the ob- 
served space-charge-limited currents below their theo- 
retical value for a trap-free crystal. The measure of this 
reduction is the ratio of free to trapped carriers. Thus, 
the observed currents should approach those for a 
perfect crystal when the number of free carriers matches 
or exceeds the number of traps. The more perfect the 
crystal, the lower the field at which this occurs. 

The density of electron charges forced into a crystal
$d$ centimeters thick, having plane parallel electrodes,
may be written approximately as $10^7(V/d^2)$ electron
charges/cm$^3$ for an assumed dielectric constant of 10.
This means that for a millimeter-thick crystal having
$10^{10}$ traps/cm$^3$, the space-charge-limited currents should
approach their theoretical values at ten volts. At this
voltage the current will be about 10 microamperes/cm$^2$
for a mobility of 100 cm$^2$/volt-sec. The measurement of
trap densities of only one part in 10'° becomes then a 
simple current-voltage measurement at low voltages 
and easily measurable currents. 
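
The estimates in this section can be reproduced with a few lines of arithmetic. The sketch below assumes that the injected density is the parallel-plate charge $CV$ spread uniformly through the crystal, $n = \kappa\epsilon_0 V/(e d^2)$, and that the trap-free current follows the standard square law $J = (9/8)\kappa\epsilon_0\mu V^2/d^3$. Both forms and all function names are assumptions made for illustration.

```python
EPS0 = 8.854e-14   # permittivity of free space, farads/cm
Q_E = 1.602e-19    # electron charge, coulombs


def injected_density(volts, d_cm, kappa=10.0):
    """Electron charges per cm^3 forced into a crystal d_cm thick (assumed uniform)."""
    return kappa * EPS0 * volts / (Q_E * d_cm ** 2)


def trap_free_current_density(volts, d_cm, mu, kappa=10.0):
    """Trap-free space-charge-limited current density in A/cm^2 (assumed square law)."""
    return 9.0 / 8.0 * kappa * EPS0 * mu * volts ** 2 / d_cm ** 3


if __name__ == "__main__":
    # One millimeter crystal at ten volts, mobility 100 cm^2/volt-sec.
    print(injected_density(10.0, 0.1))
    print(trap_free_current_density(10.0, 0.1, 100.0))
```

With these inputs the density comes out within a factor of two of $10^{10}$ per cm$^3$ and the current density close to 10 microamperes/cm$^2$, consistent with the order-of-magnitude figures quoted in the text.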


XIII. CONCERNING BREAKDOWN IN INSULATORS 


When the voltage across an insulator is increased 
steadily the power dissipation in the insulator is finally 
increased to the point where the insulator “burns out”
or is said to “break down.” If enough carriers are nor- 
mally present in the insulator, the breakdown is a rela- 
tively slow and gradual process in which the increased 
voltage at first leads to an increased temperature. More 
carriers are generated at the higher temperature and the 
approach to breakdown becomes more rapid. This 
process is known as “thermal breakdown” and may be 
followed reversibly to values close to actual breakdown. 
A second process” that has received the major share of 
theoretical attention is a fast, electronic process known
as “intrinsic breakdown.” Here a critical electric field
may be observed at which the carrier density is pre- 
cipitously increased by a collision ionization and re- 
sulting avalanching process or by field emission from 
the filled band. Even though the actual breakdown 
field has a sharply defined value, there is reason to 
expect in this model also that the prebreakdown cur- 
rents will increase faster than linearly with voltage. 

The present discussion adds a third mechanism for 
increasing the carrier density in insulators and must be 
considered in analyzing breakdown data. This mecha- 
nism of space-charge-limited currents becomes more 
significant as the crystallinity of the insulator improves. 
It may be distinguished from intrinsic breakdown by 
the fact that the breakdown field should increase ap- 
proximately linearly with electrode spacing. 


H. Fröhlich and J. H. Simpson, Advances in Electronics
(Academic Press, New York, 1950), Vol. 2, p. 185.

A rough estimate of the contribution of space-charge-
limited currents may be made as follows. Let the in- 
trinsic breakdown field strength be known. From this 
value of field and the known geometry of the specimen 
a value for the space-charge density in the insulator 
may be computed. This value of space charge density, 
converted to carrier density, must be comparable with 
the trap density in order that space-charge-limited 
currents be significant. For example, at a field of $10^6$
volts/cm in an insulator $10^{-3}$ cm thick, the number of
electron charges per cm$^3$ forced into the insulator would
be $10^{16}$. Trap densities less than this value would allow
space-charge-limited currents to be significant; trap 
densities greater than this value would tend to suppress 
the space-charge-limited currents. 


APPENDIX 


I. CURRENT-VOLTAGE CURVE FOR EXPONENTIAL 
TRAP DISTRIBUTIONS 


The trap density per unit energy range is defined by

$$n_t = A e^{-E/kT_c}, \tag{30}$$

where $E$ is the energy measured from the bottom of the
conduction band and $T_c$ is a characteristic temperature
greater than the temperature at which the currents
are measured. The condensed charge forced into the
insulator is

$$Q = VC. \tag{31}$$

This condensed charge raises the Fermi level by an
amount $\Delta E$ defined by the relation

$$\int_{E_f - \Delta E}^{E_f} n_t \, dE = \frac{VC}{e}, \tag{32}$$

or

$$\int_{E_f - \Delta E}^{E_f} A e^{-E/kT_c} \, dE = \frac{VC}{e}. \tag{33}$$

The solution of Eq. (33), neglecting the upper limit
of integration, is of the form

$$\Delta E = kT_c (K + \ln V), \tag{34}$$

where $K$ contains the temperature but not the voltage.
The ratio of free to trapped charge is [see Eq. (11)]

$$\theta = e\, n_c \exp(\Delta E/kT) / VC. \tag{35}$$

If Eq. (34) is used for $\Delta E$,

$$\theta = \text{constant} \cdot \exp[(T_c/T) \ln V] / V \tag{36}$$
$$\phantom{\theta} = \text{constant} \cdot V^{(T_c/T) - 1}. \tag{37}$$

This value for $\theta$ is now inserted in Eq. (9) to give

$$I \propto V^{(T_c/T) + 1}. \tag{38}$$
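
The step from Eq. (33) to Eq. (34), and from there to the power law, can be made explicit. The following worked derivation is a sketch under two assumptions: that the integral in Eq. (33) runs from $E_f - \Delta E$ to $E_f$, and that Eq. (9), which lies outside this appendix, is the square law multiplied by $\theta$, so that $I \propto \theta V^2$. In $VC/e$ the symbol $e$ is the electronic charge.

```latex
\begin{align*}
\int_{E_f-\Delta E}^{E_f} A e^{-E/kT_c}\,dE
  &= A k T_c \left[ e^{-(E_f-\Delta E)/kT_c} - e^{-E_f/kT_c} \right]
  \approx A k T_c\, e^{-(E_f-\Delta E)/kT_c} = \frac{VC}{e}, \\
\Delta E &= k T_c \left[ \frac{E_f}{kT_c} + \ln\frac{C}{e A k T_c} + \ln V \right]
  = k T_c\,(K + \ln V), \\
\theta &\propto \frac{\exp(\Delta E/kT)}{V}
  = \frac{\exp[(T_c/T)(K + \ln V)]}{V} \propto V^{(T_c/T)-1}, \\
I &\propto \theta V^2 \propto V^{(T_c/T)+1}.
\end{align*}
```

The term dropped in the first line is the contribution of the upper limit, which is small compared with the retained term once $\Delta E$ exceeds a few $kT_c$.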
II. FIELD AND CHARGE DISTRIBUTION
BETWEEN CATHODE AND ANODE

A solution is sought for the usual pair of equations
for one-dimensional space-charge-limited flow:

$$\frac{dE}{dx} = \frac{4\pi\rho}{k}, \tag{39}$$

$$I = \rho_f \mu E, \tag{40}$$

subject to the boundary condition $E = 0$ at $x = 0$. $\rho$ is
the space charge density and is composed of a part, $\rho_f$,
in the conduction band and a part, $\rho_t$, in traps. Let
$\rho_f \ll \rho_t$ so that it may be neglected in Eq. (39). Also,
from Eq. (37) let

$$\rho_f = A \rho_t^{\,n}, \tag{41}$$

where

$$n = T_c/T. \tag{42}$$

Equation (39) may be rewritten, using Eqs. (40)
and (41), in the form

$$\frac{dE}{dx} = \frac{4\pi}{k} \left( \frac{\rho_f}{A} \right)^{1/n} = \frac{4\pi}{k} \left( \frac{I}{\mu E A} \right)^{1/n} = B E^{-1/n}, \tag{43}$$

where $B = (4\pi/k)(I/\mu A)^{1/n}$.

The solution of Eq. (43) satisfying the boundary
condition is

$$E = \left[ \frac{n+1}{n} B x \right]^{n/(n+1)}. \tag{44}$$

As $n \to \infty$, $E \to Bx$. If $n = 1$, the usual form for a trap-free
(as well as a shallow trap) model is obtained,
namely $E \propto x^{1/2}$.

The distribution of trapped space charge using Eq.
(39) is

$$\rho_t = \frac{k}{4\pi} \frac{dE}{dx} = \frac{k}{4\pi} B \left[ \frac{n+1}{n} B x \right]^{-1/(n+1)}. \tag{45}$$

Again this reduces to the familiar $x^{-1/2}$ form when $n = 1$,
but approaches a constant for large $n$.
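
As a numerical check on Eq. (44), the sketch below substitutes the closed-form field into the differential equation by central differences. It assumes that Eq. (43) reduces to $dE/dx = B E^{-1/n}$ and that Eq. (44) reads $E = [((n+1)/n) B x]^{n/(n+1)}$. The function names and parameter values are arbitrary illustrative choices in consistent units.

```python
def field(x, n, b):
    """Closed-form field E(x) = [((n+1)/n) * b * x] ** (n/(n+1)) (assumed Eq. 44)."""
    return ((n + 1.0) / n * b * x) ** (n / (n + 1.0))


def residual(x, n, b, h=1.0e-6):
    """dE/dx by central difference minus b * E**(-1/n); near zero if E solves the ODE."""
    slope = (field(x + h, n, b) - field(x - h, n, b)) / (2.0 * h)
    return slope - b * field(x, n, b) ** (-1.0 / n)


if __name__ == "__main__":
    for n in (1.0, 3.0, 10.0):
        print(n, field(0.5, n, 2.0), residual(0.5, n, 2.0))
```

The residual stays at the level of round-off for each $n$ tried, and for very large $n$ the field approaches the linear form $Bx$.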

The uniform distribution of space charge, at large n, 
means that when the electrodes are shorted only a 
vanishingly small net current will flow as the space 
charge leaves the insulator. The space charge will flow 
out almost symmetrically at both ends of the insulator.


Equation of State of Metals from Shock Wave Measurements*


JOHN M. WALSH AND RUSSELL H. CHRISTIAN†
Los Alamos Scientific Laboratory, Los Alamos, New Mexico 


(Received October 21, 1954) 


Shock wave pressure magnitudes from about 150 to 500 kilobars have been attained for metals by using 
high explosives. A photographic technique for the nearly simultaneous determination of shock and free 
surface velocities is presented, and measurements for aluminum, copper, and zinc are given. 

Expressions are derived which permit the calculation of pressure-compression points from measured 
velocity pairs. Consequent Hugoniot curves are presented, probable errors for which are 1 to 2 percent in 
compression for a given pressure. Finally, the known Hugoniot curves are employed in a calculation which 


determines temperatures and isotherms. 


I. INTRODUCTION 


When a detonation wave interacts with an
explosive-metal interface, a compression wave 
is transmitted into the metal. In the ordinary case this 
disturbance is a shock wave separating a compressed 
state from the undisturbed metal. The pressures at- 
tained behind such shock waves are typically in the 
range 150 to 500 kilobars (1 kilobar = $10^9$ dynes/cm$^2$
= 986.9 atmospheres). The associated problem of deter- 
mining pressure-compression data from shock wave 

* Work done under the auspices of the U. S. Atomic Energy
Commission. Papers on this subject were given by the authors
at the July, 1953 meeting of the Fluid Dynamics Section of the
American Physical Society at State College, Pennsylvania and
at the 1954 annual meeting of the American Physical Society.

† Now at the University of California Radiation Laboratory,
Livermore, California.


measurements is the subject of the present investi- 
gation. Such data serve to supplement and extend the 
wealth of static pressure-compression data which exist 
for pressures up to 100 kilobars.¹

Two basic assumptions are employed throughout the 
present considerations. First, since shock pressures are 
several hundred times yield points of the materials 
involved, an ordinary “fluid” type equation of state is 
assumed, i.e., a functional relationship (unspecified) 
between P, V, and T is assumed to be an adequate 
representation of the metal. This assumption precludes 
the explicit treatment of effects arising from the 
material rigidity which, however, are felt to play a
negligible role in the description of states in the present
pressure range.

¹ See P. W. Bridgman, Revs. Modern Phys. 18, 1-93 (1946)
for a general review.

Second, thermodynamic equilibrium is assumed for 
the calculation of states behind the shock front. 
Specifically, it is assumed that in a time of $10^{-7}$ second
or less thermodynamic equilibrium is essentially complete.²
$10^{-7}$ second corresponds to a shock propagation
distance of a few tenths of a millimeter, so that the 
condition is equivalent to assuming the shock front 
thickness is a few tenths of a millimeter or less. Appli- 
cation of the Rankine-Hugoniot equations is exact, of 
course, regardless of the thickness of the shock front. 
The limitation is imposed because of the finite scale of 
experimentation: shocks were measured after propaga- 
tion distances as small as ten millimeters and free 
surface velocities were measured in the first two or 
three millimeters of travel. For such geometries the 
$10^{-7}$-second figure is sufficiently small to assure that
transient phenomena do not affect the measurements 
so that measured velocities transform to describe an 
equilibrium state behind the shock wave. An equilibrium 
time greater than about $10^{-6}$ second, on the other hand,
could cause transient phenomena which would not be 
detected experimentally, i.e., measured velocities would 
transform to describe a quasi-equilibrium state behind 
the lead part of the shock front. It seems probable that 
polymorphic transitions, such as Professor Bridgman 
has observed for various substances, could in some 
instances require times in excess of $10^{-6}$ second. No
such transitions have been observed in static experi- 
mentation for the substances reported herein and it is 
assumed that none occur. In the above sense, however, 
present shock-determined equation-of-state data are 
dynamic data which would differ from corresponding 
static equation of state where equilibrium times in 
excess of about $10^{-6}$ second are involved.

Experimental data consist of the accurate measure- 
ment of two velocities associated with the shock wave; 
these are the velocity of the wave as it approaches the 
free surface of the metal plate (shock velocity) and the 
initial velocity of the plate free surface when the shock 
is reflected at this surface as a rarefaction wave (free- 
surface velocity). Figure 1 illustrates these velocities 
and certain symbols to be used below. A method for 
such measurements and data for a series of experiments 
on aluminum, copper, and zinc are given as Sec. II. 

Transformation of the measured velocity pairs to 
pressure-compression points is immediate (a straight- 
forward application of the first two Rankine-Hugoniot 
equations) if one approximates the particle velocity 


² This interval, the transit time for a particle from an initial
state of essentially complete thermodynamic equilibrium ahead
of the shock to a final state of “essentially complete” thermodynamic
equilibrium behind the shock, is necessarily an approximate
concept. Its ultimate refinement is contingent on the
structure of the shock front (in particular, the manner of approach
to thermodynamic equilibrium), from which definitions of “essentially
complete” could be made. Even so, the concept is a useful
one which permits classification as to the order of magnitude of
the times involved.

FIG. 1. Schematic illustration of pressure profiles at successive
times. Symbols employed are $P$, $V (= 1/\rho)$ for pressure and
specific volume; $U_s$, $U_p$, $U_r$, $U_{fs}$ for shock velocity, shock-
particle velocity, particle velocity due to the rarefaction wave,
and free-surface velocity.


behind the shock as one-half the measured free surface 
velocity. Section III is therefore devoted to establishing 
the validity of this approximation, where expressions 
for maximum possible errors associated with its use are 
derived. Such errors, as applied to present data, are 
always less than one percent in compression at a given 
pressure. 

The relatively small effect of temperature upon metal 
compressibility allows the precise determination of 
related pressure-compression curves, (e.g., adiabats and 
isotherms neighboring the Hugoniot) by the application 
of small temperature perturbations. This procedure is 
given in Sec. IV, where the second TdS equation of 
thermodynamics and assumptions of constant specific 
heat ($C_v$) and $(\partial P/\partial T)_V$ are used to formulate a basis
for calculation. 


II. EXPERIMENTATION 


A typical arrangement used for the determination of 
velocities is illustrated as Figs. 2(a)-(c). The high-

FIG. 2. (a) High explosive system, metal plate, and end view
of the Lucite assembly on the free surface of the metal plate. 
(b) Cutaway side view showing the position of the central Lucite 
block. (c) Camera view of the assembled shot. The sweep fiducials, 
end views of which are shown here, are aluminum tubes, approxi- 
mately } in. o.d., $ in. id. by 5 in. long. The far ends of the latter 
are imbedded about } in. in the metal plate; the near ends are 
pinched to define a small (0.02 in.) aperture.

The slit system consists of a 0.030 in. thick dural plate, into 
which are machined the 0.040 in. wide slits. The slit system is 
mounted 1.5 in. in front of, and parallel to, the main plate. The 
central slit is aligned to view the central Lucite block; the two 
slits to either side of the central slit are over the side Lucite 
blocks.

explosive assembly consists of a plane wave lens,³
followed by a block of high explosive. This assembly 
induces a strong shock into the machined metal plate, 
one surface of which is in contact with the high explo- 
sive. 

Shock and free-surface velocities are recorded by a 
high-speed sweep camera.⁴ Shock velocity is measured
as near as possible to the free surface of the plate to 
minimize effects of shock deceleration, and free-surface 
velocity is determined for the first few millimeters of 
run. These measurements are accomplished with an 
assembly on the free surface of the metal plate, of 
which the following is typical: A flat bottom groove, 
0.5 in. wide, is milled near the center of the free surface 
at an angle of 10° to the surface. A Lucite block ½ in.
× 1 in. × 5 in. is accurately placed in the groove parallel
to, but 0.010 in. away from the bottom surface. The 
lower surface of the Lucite is machined and polished 
flat, and that portion of the lower surface which 
protrudes from the groove (about half) is covered by a 
0.020 in. dural plate which is spaced 0.010 in. away 
from the Lucite surface. The pieces are glued together 
with small shims maintaining the 0.010 in. gaps. This 
state of assembly is illustrated as Fig. 2(b). Two 
additional Lucite blocks (1 in. × 1 in. × 5 in.) are similarly
fitted alongside the groove, parallel to the free
surface of the metal plate but 0.010 in. away from it 
[see Fig. 2(a) ]. 

As the shock wave reaches the 0.010-in. gaps at the 
groove bottom and the plate free surface, it causes the 
metal to move to the Lucite. This process, in turn, 
causes multiple shock reflections of the gas within the 
gaps and luminosity results. Air shocks have proved 
sufficiently luminous in certain experiments, while 
others (where the metal free surface velocity is low) 
require the introduction of argon gas in the 0.010-in. 
gaps. The Lucite surface, under attack by the gas shocks 
and moving metal surface, quickly becomes opaque, 
providing a sharp shutter action for the light. 

A high-speed moving-image camera views this light 
through a slit system [Fig. 2(c) ], sweeping the image 
in a direction perpendicular to the slits. The extinction 
of light due to the opacity of the Lucite is useful here 
since it prevents the record from one slit being swept 
over that for lower slits. 

Traces from all but the central slit merely record 
times of arrival of the shock wave at various positions 
on the free surface of the metal; such information must 
be incorporated in the record to correct for yaw and 
curvature of the “plane” wave. For the central trace, 
that half corresponding to the groove in the plate gives 

³ These plane wave generators are a lens-type combination of
two explosives with slow and fast detonation velocities, such that 
point initiation is converted into a plane detonation wave. See 
J. H. Cook, Research (London) 1, 474 (1948). 

⁴ The sweep camera used in present investigations is an f=11
synchronous rotating-mirror camera capable of writing speeds as
high as 3.2 millimeters on the film per microsecond. Photographic
records are taken on glass plates (4 in. × 5 in.) to minimize
uncertainties associated with film shrinkage.

a record of shock velocity as the shock wave approaches
the free surface. The other half of this trace is caused 
by the collision of the plate with the Lucite assembly 
protruding from the groove; i.e., by closing the 0.010-in. 
gap defined by the bottom surface of the Lucite and 
the 0.020-in. aluminum. This half of the trace consti- 
tutes a record of free-surface motion. The primary 
purpose of the 0.020-in. aluminum cover plate is to 
prevent light from the relatively weak air shocks ahead 
of the metal free surface from writing on the record. 
It is possible, however, to eliminate the 0.020-in. plate 
and measure the trailing edge of the resultant film trace. 
Velocities so obtained agree within experimental error 
with those measured by the present method. 

The direction of increasing time is conveniently and 
accurately fixed on the record by two long tubes [see 
Fig. 2(c)] embedded (about ½ in.) in the metal plate
above the Lucite. Brilliant jets within the tubes, viewed 
by the camera through small holes in the tube ends, 
register sweep direction on the sides of the film. Finally, 
the images on the record of small metal strips placed 
across the slits (at known spacing) serve to fix the 
scale (or magnification number) for the record. 

The assembly is mounted for firing on a wooden base, 
and plywood shields (defining a small window through 


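TABLE I. Experimental data. Columns: material; explosive block
(thickness in in., and type); plate thickness (in.); shock velocity,
$U_s$ (mm/μsec); free-surface velocity, $U_{fs}$ (mm/μsec).

24 ST aluminum
$\rho_0 = 2.785$ g/cm$^3$
$\alpha = 72 \times 10^{-6}$/°C
$C_p = 0.22$ cal/g °C
$C_v = 0.22$ cal/g °C
$(\partial P/\partial T)_V = 50.6 \times 10^6$ dynes/cm$^2$ °C
2 (thick) Baratol

2S aluminum
$\rho_0 = 2.706$ g/cm$^3$
$\alpha = 72 \times 10^{-6}$/°C
$C_p = 0.22$ cal/g °C
$C_v = 0.22$ cal/g °C
$(\partial P/\partial T)_V = 50.6 \times 10^6$ dynes/cm$^2$ °C

Zinc
$\rho_0 = 7.14$ g/cm$^3$
$\alpha = 110 \times 10^{-6}$/°C
$C_p = 0.095$ cal/g °C
$C_v = 0.095$ cal/g °C
$(\partial P/\partial T)_V = 70 \times 10^6$ dynes/cm$^2$ °C

Copper
specific heat 0.093 cal/g °C
$(\partial P/\partial T)_V = 75.5 \times 10^6$ dynes/cm$^2$ °C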

2 (thick) Baratol 
4 (thick) Baratol 
2 Composition B 
2 Composition B 
2 Composition B 
2 Composition B 
4 Composition B 
4 Composition B 
4 Composition B 
6 Composition B 
6 Composition B 
8 Composition B 
12 Composition B 


4 (thick) Baratol 
2 Composition B 
4 Composition B 
8 Composition B 


4 (thick) Baratol 
2 (thick) Baratol 
2 Composition B 
4 Composition B 
6 Composition B 
6 Composition B 
12 Composition B 
8 Composition B 


2 (thick) Baratol 
4 (thick) Baratol 
4 TNT 


4 TNT 


2 Composition B 
2 Composition B 
4 Composition B
3 Composition B 
3 Composition B 
6 Composition B 
12 Composition B 
12 Composition B

FIG. 3. A photographic record. Vertical side streaks record the
direction of increasing time. The four horizontal traces record 
relative times of arrival of the shock wave at the plate free surface. 
Slanting lines are velocity traces for the shock and free surface 
motions. Breaks in the reference lines correspond to positions of 
magnification tapes. 


which the camera views the slit plate and the sweep 
fiducials) are incorporated to protect the record from 
stray light for the ten or so microseconds until the 
changing field of view of the sweep camera no longer 
coincides with the exploding assembly. The camera and 
its operators are in an underground chamber some 15 
feet away. 

A photographic record is given as Fig. 3, where 
pertinent features are identified. 

The analysis of the record is accomplished with a 
Gaertner comparator which measures to one micron in 
both horizontal and vertical directions. First, the film 
is accurately aligned so that the vertical direction of 
cross-hair travel corresponds to the sweep direction 
(determined by the side streaks). Readings are made 
by recording the vertical scale positions of all lines 
(including the shock or free surface line) for each of a 
series of horizontal scale readings. Reference slit read- 
ings for each vertical line are then interpolated to 
determine the corresponding value for the central slit. 
The differences between each of these values and the 
corresponding value of the shock velocity line (or free 
surface velocity line) then constitute data for distance 
(along the central slit) versus time plots, vertical differ- 
ences being readily converted to times by knowledge of 
camera writing speed. Each point on these plots is 
defined by measurements which affect only that point, 
so that the method of least squares is appropriately 
used to determine the best (straight line) fits. Velocities 
corresponding to the slopes are then the phase velocities 
of interception of the shock wave with the 10° groove 
and the free surface with the 10° wedge, which can be 
readily converted to true shock and free surface 
velocities. 
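
The straight-line fit described above can be sketched in a few lines. The following Python example is a minimal least-squares fit of distance against time, assuming each point carries an independent error of similar size. The function name and the sample readings are hypothetical, and the conversion from the fitted phase velocity to the true shock or free-surface velocity is not attempted here.

```python
def fit_line(times, distances):
    """Least-squares straight line through (time, distance); returns (slope, intercept)."""
    n = len(times)
    if n < 2:
        raise ValueError("need at least two points")
    t_mean = sum(times) / n
    s_mean = sum(distances) / n
    s_tt = sum((t - t_mean) ** 2 for t in times)
    s_ts = sum((t - t_mean) * (s - s_mean) for t, s in zip(times, distances))
    slope = s_ts / s_tt
    return slope, s_mean - slope * t_mean


if __name__ == "__main__":
    # Hypothetical comparator readings: time in microseconds, distance along the slit in mm.
    times_usec = [0.0, 0.1, 0.2, 0.3, 0.4]
    distances_mm = [0.00, 3.62, 7.18, 10.83, 14.39]
    phase_velocity, offset = fit_line(times_usec, distances_mm)
    print(phase_velocity, offset)  # slope is a phase velocity in mm/microsecond
```

Because the slope depends only on differences from the mean, a constant offset in either the time or distance scale of the record leaves the fitted velocity unchanged.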

Experimental data are listed in Table I and presented 
graphically as Fig. 4. Probable errors for individual 
velocity determinations are of the order of 0.5 percent.

FIG. 4. Experimental data. See also Table I.


Most of this error (about 0.4 percent) is ascribed to 
shot assembly uncertainties. Tolerances in this phase of 
the work are accordingly low, all measurements in the 
Lucite-metal assembly being made to 0.0001 in. Uncer- 
tainty in the analysis is 0.1 percent to 0.5 percent, 
depending upon record quality. Assembly and analysis 
measurements are made individually for each experi- 
ment, so that the resultant errors should be random 
and represented by scatter in the final data. Camera 
writing speed, however, is applied in the determination 
of all velocities. This parameter has been measured to 
better than 0.2 percent and is therefore believed to be 
eliminated as the only possible source of significant 
consistent error in the determination of velocities. 


III. DETERMINATION OF HUGONIOT CURVES
FROM EXPERIMENTAL DATA


A. Basic Equations⁵

The conservation equations for a shock wave are

$$\rho_0 U_s = \rho_1 (U_s - U_p), \quad \text{Mass} \tag{1}$$
$$P_0 + \rho_0 U_s^2 = P_1 + \rho_1 (U_s - U_p)^2, \quad \text{Momentum} \tag{2}$$
$$E_1 - E_0 = \tfrac{1}{2}(P_1 + P_0)(V_0 - V_1), \quad \text{Energy} \tag{3}$$

which, together with their immediate consequences,

$$V_1/V_0 = (U_s - U_p)/U_s, \tag{4}$$
$$P_1 = \rho_0 U_s U_p + P_0, \tag{5}$$
$$U_p = [(P_1 - P_0)(V_0 - V_1)]^{1/2}, \tag{6}$$

will be used below.

Similarly, from the governing equations for continuous
adiabatic flow (mass and momentum conservations),
the expression

$$U_r = \int_{P_0}^{P_1} \left[ \left( -\frac{dV}{dP} \right)^{1/2} dP \right]_{\text{adi}} \tag{7}$$

for the particle velocity due to a centered simple
rarefaction wave can be derived. The subscript “adi” is
used to denote integration along a line of constant
entropy.

In the present problem the material velocity behind
the rarefaction wave (this velocity corresponds to the
measurable free surface velocity, $U_{fs}$, referred to the
laboratory system of coordinates) is the sum of that
due to the shock and that due to the rarefaction wave,
i.e.,

$$U_{fs} = U_p + U_r. \tag{8}$$

The approximate relation,

$$U_r/U_p = 1, \tag{9}$$

combined with Eqs. (8), (4), and (5) suffices to determine
a $P_1$, $V_1$ point per set $(\rho_0, U_s, U_{fs})$ of experimental
data. The investigation of this approximation constitutes
the next two subsections, where expressions for
extreme possible ratios are developed.

⁵ Equations (1)-(7) are derived in any treatise on shock-wave
hydrodynamics. See, for example, R. Courant and K. O.
Friedrichs, Supersonic Flow and Shock Waves (Interscience
Publications, New York, 1948). Eq. (7), in particular, can
be obtained from their Eq. (34.05).
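
The reduction of a measured velocity pair to a pressure-compression point can be sketched as follows. This minimal Python example assumes Eqs. (4), (5), (8), and (9) in the forms $V_1/V_0 = (U_s - U_p)/U_s$, $P_1 = \rho_0 U_s U_p + P_0$, and $U_p = U_{fs}/2$. The function name and the sample velocities are hypothetical and are not taken from Table I.

```python
def hugoniot_point(rho0, u_s, u_fs, p0=0.0):
    """Return (P1 in kilobars, V1/V0) from density in g/cm^3 and velocities in mm/microsecond.

    Uses the approximation U_p = U_fs / 2, i.e. U_r = U_p in U_fs = U_p + U_r.
    """
    if u_s <= 0.0:
        raise ValueError("shock velocity must be positive")
    u_p = 0.5 * u_fs
    volume_ratio = (u_s - u_p) / u_s
    # 1 g/cm^3 * (1 mm/microsecond)^2 = 1e10 dynes/cm^2 = 10 kilobars
    p1 = 10.0 * rho0 * u_s * u_p + p0
    return p1, volume_ratio


if __name__ == "__main__":
    # Hypothetical velocity pair for a plate of density 2.785 g/cm^3.
    pressure_kb, v_ratio = hugoniot_point(2.785, 7.0, 2.0)
    print(pressure_kb, v_ratio)
```

The factor of 10 converts the product of density and velocities, expressed in these mixed units, to kilobars; with other units the conversion constant would change accordingly.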


B. Maximum Possible Value of U,/U, 


Consider a substance which is shocked from the 
initial state (Po,Vo) to some point (P:,V1) on the 
Hugoniot curve (see Fig. 5) and then relieved adia- 
batically to a point (Po’Vo’). The gain in specific 
internal energy due to the shock is given by Eq. (3), 
while the loss upon adiabatic expansion is f"[—PdV ]aai 
evaluated along the adiabat. The difference between 
final and initial internal energy is therefore 


a Pi+Po 
~~ 
2 


(Vo-— v)-f [ Pdv Jnai. (10) 


Initial and final pressures in the present applications 
(Po is the pressure of the atmosphere, P9’ is the pressure 
of the air shock ahead of the moving metal free surface) 
are essentially zero. For this condition values of the 
specific heat (cy) and the thermal coefficient of volume 
expansion (a) are available from handbook tabulations. 
Thus 


E_0' - E_0 = C_p (T_0' - T_0) = (C_p/\alpha V_0)(V_0' - V_0). (11)


Equating (10) and (11) and substituting the identities 


V_0' = V_1 - \int_0^{P_1} [(dV/dP)\, dP]_{adi},

\int_{V_1}^{V_0'} [P\, dV]_{adi} = \int_0^{P_1} [V\, dP]_{adi} - P_1 V_1,


then yields 


\int_0^{P_1} [(V - \beta\, dV/dP)\, dP]_{adi} = \frac{P_1 (V_0 + V_1)}{2} + \beta (V_0 - V_1), (12)

\beta = C_p/\alpha V_0,


for an expression of the fact that the total change in 
specific internal energy is zero around a closed cycle 
(in this case 0 1 0' 0, Fig. 5).

We now seek, employing known results from the
calculus of variations, that adiabat which produces an
extreme value of U_r. Specifically, we seek that curve
V(P) between the point P_1, V_1 and the line P=0 (but








Fig. 5. P-V plot showing relative positions of the Hugoniot
and a pressure-release adiabat. 


otherwise arbitrary) which produces an extremum of 
the right side of Eq. (7) and also satisfies the accessory 
condition defined by Eq. (12). 

The linear combination of integrands is^6


K = \lambda (-dV/dP)^{1/2} - \beta\, dV/dP + V,


where \lambda is an undetermined multiplier. The associated
Euler equation is


\frac{d}{dP}\left[\frac{\lambda}{2}\left(-\frac{dV}{dP}\right)^{-1/2}\right] + 1 = 0.


Successive integration yields
(V - b)(P - a + \beta) = \lambda^2/4.
This relation is subject to the three conditions

V = V_1 at P = P_1,

\partial K/\partial(dV/dP) = 0 at P = 0
(K denoting the linear combination of integrands),

and Eq. (12), which suffice to determine a, b, and \lambda.
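The resulting P-V curve is the hyperbola


V = V_1 + \frac{(V_0 - V_1)(P_1 + 2\beta)(P_1 - P)}{2 (P + \beta)(P_1 + \beta) \ln[(P_1 + \beta)/\beta]},


and the associated extremum of U_r, from Eq. (7), is


U_r = \left[\frac{(V_0 - V_1)(P_1 + 2\beta)}{2} \ln\frac{P_1 + \beta}{\beta}\right]^{1/2}.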
6 Notation used is that of H. Margenau and G. M. Murphy,
The Mathematics of Physics and Chemistry (D. Van Nostrand
Company, New York, 1943), Chap. VI.





(13) 
Fig. 6. Maximum and minimum possible U_r/U_p ratios.


This extremum is clearly a maximum since a straight
line adiabat from P_1, V_1 to P=0, V_0, in particular,
satisfies Eq. (12) and yields a smaller U_r. The maximum
ratio is

(U_r/U_p)_{max} = \left[\frac{P_1 + 2\beta}{2 P_1} \ln\frac{P_1 + \beta}{\beta}\right]^{1/2}, (15)

since U_p does not depend upon the adiabat.

This ratio is unity in the limiting case of zero shock
pressure and increases monotonically with P_1. It is also
interesting to note that the ratio is independent of V_1;
plots of calculated values versus P_1 are given in Fig. 6.
Values are typically about 1.02 and less than 1.04 in
all cases.
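The behavior of the maximum ratio described above can be checked numerically. The following is a minimal sketch that evaluates Eq. (15) as a function of P_1 and \beta; the function name and the sample value of \beta are assumptions made for illustration and are not taken from the paper's tabulated data.

```python
import math


def max_release_ratio(P1, beta):
    """Upper bound (U_r/U_p)_max of Eq. (15).

    P1 and beta = C_p/(alpha*V_0) must be in the same pressure units.
    """
    if P1 <= 0.0:
        return 1.0  # limiting value at zero shock pressure
    x = P1 / beta
    # (P1 + 2*beta)/(2*P1) = (x + 2)/(2*x); ln[(P1 + beta)/beta] = ln(1 + x)
    return math.sqrt((x + 2.0) / (2.0 * x) * math.log1p(x))


# Hypothetical beta of 350 kilobars, for illustration only.
for P1 in (50.0, 100.0, 200.0, 400.0):
    print(P1, max_release_ratio(P1, 350.0))
```

For small P_1/\beta the expression behaves like 1 + (P_1/\beta)^2/24, which is one way to see that it starts at unity and grows slowly.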

The determination for an experiment (set of values
\rho_0, U_s, U_{fs}) of the corresponding ratio is not immediate
since P_1 is not a priori known. The solution is easily
attained, however, by a simple process of successive
approximations. Unity is assumed for the ratio in order
to determine a corresponding pressure through Eqs. (5)
and (8). This pressure yields a refined value of the ratio
through Eq. (15). This value of the ratio is used to
redetermine a pressure, etc. In practice the first-calculated
value of the ratio proves sufficiently precise,
being accurate to four significant figures in present
applications.
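The successive-approximation procedure just described can be sketched as a short loop. This is only an illustration under stated assumptions: P_0 is taken as zero, the helper names are hypothetical, and the input numbers (including \beta) are invented rather than taken from the experiments.

```python
import math


def max_release_ratio(P1, beta):
    """(U_r/U_p)_max from Eq. (15); P1 and beta in the same units."""
    x = P1 / beta
    return math.sqrt((x + 2.0) / (2.0 * x) * math.log1p(x))


def refine_max_ratio(rho0, U_s, U_fs, beta, tol=1e-10, max_iter=50):
    """Iterate pressure and maximum ratio until the ratio stops changing."""
    ratio = 1.0                          # starting guess, Eq. (9)
    P1 = 0.0
    for _ in range(max_iter):
        U_p = U_fs / (1.0 + ratio)       # Eq. (8) with U_r = ratio * U_p
        P1 = rho0 * U_s * U_p            # Eq. (5) with P_0 = 0
        new_ratio = max_release_ratio(P1, beta)
        if abs(new_ratio - ratio) < tol:
            return P1, new_ratio
        ratio = new_ratio
    return P1, ratio


# Hypothetical inputs: g/cm^3 and km/s give pressures in GPa, so beta is in GPa.
P1, ratio = refine_max_ratio(rho0=2.785, U_s=7.0, U_fs=2.0, beta=35.0)
print(P1, ratio)
```

Because the ratio depends only weakly on P_1, the loop typically settles after a few passes, consistent with the remark that the first calculated value is already adequate.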


C. Minimum Possible Value of U_r/U_p


A minimum for the U_r/U_p ratio can also be established.
To this end the following general conditions are
imposed upon possible adiabats, V(P), connecting the
state P_1, V_1 behind the shock, and the line P=0:
dV/dP < 0,

d^2V/dP^2 > 0, (17)

V(P) > V_H(P). (18)

The third condition follows from the more basic
equation of state condition \partial V(P,S)/\partial S > 0 and the
well-established fact^7 that entropy increases with pressure
along the Hugoniot curve V_H(P).





(16) 


7 See, for example, Courant and Friedrichs, reference 5, pp.
141-146, where proof is obtained using only conditions equivalent
to those enumerated above (i.e., the Bethe-Weyl conditions).
The minimum possible value of U_r compatible with
these restrictions corresponds to an adiabat which
coincides everywhere with the Hugoniot curve. To
prove this, it is sufficient to show that any adiabat
V_T(P) which does not everywhere coincide with the
Hugoniot curve does not yield the minimum value of
U_r, i.e., can be replaced by another curve V(P) for
which the associated U_r is smaller. Consider two points
P_3, P_2 (P_3 > P_2) on some segment of the curve V_T(P)
which does not coincide with the Hugoniot (i.e., V_T > V_H
for P_3 > P > P_2). Replace the segment of V_T(P) between
P_3 and P_2 by


V(P) = V_T(P) - \epsilon [(P_3 - P)(P - P_2)]^2, (19)


where \epsilon is sufficiently small that conditions (16)-(18)
are not violated. The contribution to the integral in
Eq. (7) for this new segment is


\Delta U_r = \int_{P_2}^{P_3} \left[-\frac{dV_T}{dP} + 2\epsilon (P - P_2)(P_3 - P)(P_2 + P_3 - 2P)\right]^{1/2} dP.


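The derivative with respect to \epsilon may be written:


\frac{\partial(\Delta U_r)}{\partial\epsilon} = -\int_{\frac{1}{2}(P_2 + P_3)}^{P_3} \frac{(P - P_2)(P_3 - P)(2P - P_2 - P_3)}{[-dV/dP]^{1/2}}\, dP
    + \int_{P_2}^{\frac{1}{2}(P_2 + P_3)} \frac{(P - P_2)(P_3 - P)(P_2 + P_3 - 2P)}{[-dV/dP]^{1/2}}\, dP. (20)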


Both integrals in the last equation are positive. Substituting
P = P_2 + P_3 - Z in the second we see that it is
identical to the first except that the denominator is
greater, by condition (17). Hence


\partial(\Delta U_r)/\partial\epsilon < 0


so that the new segment causes a decrease in \Delta U_r and
hence a lower value of U_r. Thus no curve V_T(P) which
does not everywhere coincide with the Hugoniot yields
the minimum value for U_r. It follows that the desired
minimum is obtained by integration along the Hugoniot
curve


(U_r)_{min} = \int_0^{P_1} [(-dV/dP)^{1/2}\, dP]_{Hug}, (21)


so that the minimum ratio is


(U_r/U_p)_{min} = \int_0^{P_1} [(-dV/dP)^{1/2}\, dP]_{Hug} \Big/ [P_1 (V_0 - V_1)]^{1/2}. (22)


The subscript “Hug” denotes line integration along 
the Hugoniot curve. 
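To give a feeling for the size of this lower bound, the sketch below evaluates Eq. (22) numerically for an assumed Hugoniot of the linear form U_s = c_0 + s U_p. That form is an assumption introduced here for illustration; it is not the experimental curve of the paper. Writing \eta = 1 - V/V_0, the integral in Eq. (22) can be rewritten as \int_{V_1}^{V_0} (-dP/dV)^{1/2} dV, and for the assumed Hugoniot the constant c_0 cancels from the ratio.

```python
import math


def min_release_ratio(s, eta1, n=2000):
    """(U_r/U_p)_min of Eq. (22) for an ASSUMED Hugoniot U_s = c0 + s*U_p.

    eta1 = 1 - V_1/V_0 is the compression at the shocked state; requires
    s*eta1 < 1. Simpson's rule is used for the integral.
    """
    if eta1 <= 0.0:
        return 1.0
    if n % 2:
        n += 1                      # Simpson's rule needs an even interval count

    def g(eta):
        # (V_0 dP/d eta)^{1/2} / c0 for P = rho0*c0^2*eta/(1 - s*eta)^2
        return math.sqrt((1.0 + s * eta) / (1.0 - s * eta) ** 3)

    h = eta1 / n
    total = g(0.0) + g(eta1)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    integral = total * h / 3.0
    # U_p / c0 = eta1 / (1 - s*eta1) for the assumed Hugoniot
    return (1.0 - s * eta1) / eta1 * integral


# Hypothetical slope s = 1.34 and 20 percent compression.
print(min_release_ratio(1.34, 0.2))
```

With s = 0 the assumed Hugoniot is a straight line in the P-V plane and the ratio reduces to unity, which is a convenient sanity check.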

The determination from experimental data (\rho_0 and the
U_s versus U_{fs} curve) of the corresponding (U_r/U_p)_{min}
ratios is again most easily accomplished by successive 
approximations: A “first order” Hugoniot curve is 
defined by the unity approximation and Eqs. (4), (5), 
and (8), which curve then permits the calculation of 
minimum possible ratios (for that curve) from Eq. (22). 
These ratios are, in turn, employed to redetermine 
associated curves, etc., a process which can be continued 
until self-consistency is attained, i.e., until the com- 
puted curve regenerates the same ratios which deter- 
mined its locus in the previous cycle. This process, as 
applied to present data, is also rapidly convergent, the 
variation in the calculated ratios being negligible after 
the first calculation. 

Values for the (U_r/U_p)_{min} are given in Fig. 6 where
they are plotted versus shock pressure. These values, 
typically about 0.98, are greater than 0.96 throughout 
the experimental range. 


D. Results 


The extrema for U_r/U_p determine associated extreme
possible positions for the Hugoniot curve. Comparisons
of these extremes and the curve obtained by the free-surface
velocity approximation are made graphically
in Fig. 7. Neither extreme possible position disagrees
with the approximate curve (which lies about half-way
between the two) by more than 1 percent in compression
(\Delta V/V_0) for a given pressure, 0.5 percent being more
typical of all data. These uncertainties are sufficiently
small to justify the approximation as applied to present
data. Resulting data points which determine the
Hugoniot curves for aluminum, copper, and zinc are
listed in Table II and plotted in Fig. 8.

Fig. 7. Maximum possible errors in \Delta V associated with the approximation (U_r/U_p) = 1.

Fig. 8. Hugoniot curves. See also Table II.

Over-all precision for experimental Hugoniot points 
are determined by the accuracy of the above approxi- 


TABLE II. Pressure relative-volume points. 








Material P (kilobars) V/V_0 P (kilobars) V/V_0





0.804 
0.791 
0.790 
0.791 
0.789 
0.784 
0.780 


288.7 
318.9 
326.6 
323.3 
328.4 
340.2 
347.0 


133.9 
134.1 
136.8 
218.0 
258.6 
253.2 
254.9 
282.9 


141.3 
221.3 


193.0 
169.1 


24 ST aluminum 


295.0 
333.3 


348.7 
383.5 
401.0 
411.5 


356.1 
385.0 
389.5 
402.3 
462.0 
465.4 


2S aluminum 


Zinc 








mation, the precision of the velocity measurements 
reported in Sec. II, and the transformation equations 
to the P-V plane. Results may be conveniently summarized
as estimated probable errors in compression
for a given pressure, as applied to the curves. These 
over-all uncertainties are estimated at 1 percent for the 
24 ST-aluminum curve and 2 percent for the copper 
and zinc curves. 


IV. TEMPERATURE CALCULATIONS. DETERMINATION OF ISOTHERMS


The thermodynamic identity 


T\, dS = C_v\, dT + (\partial P/\partial T)_V\, T\, dV (23)

permits, for known C_v and (\partial P/\partial T)_V, the calculation
of temperatures along Hugoniot curves and adiabats.
For present applications, both C_v and (\partial P/\partial T)_V are
assumed constant. These assumptions, suggested by
the insensitivity of both parameters to pressure and
temperature in static compressibility work, are presumed
adequate to permit the approximate calculation
of temperatures and the small offsets between P-V
curves of different temperatures. For present purposes,
it is interesting to calculate the temperatures along the
Hugoniot curve, the final temperature after the material
Fig. 9. Temperatures and isothermal compressions for aluminum. Numbers along Hugoniot curve give
temperature rise (°C) associated with a shock of corresponding magnitude. Neighboring numbers in parentheses
correspond to temperature rise associated with the combined shock and rarefaction processes. An initial temperature
of 27°C and the thermodynamic data listed in Table I were assumed. The point marked (?) was also
reported by Professor Bridgman who expressed greater credence for the curve drawn.


has been relieved adiabatically to zero pressure, and
the P-V locus of one (the 27°C) isotherm. Determination
of this isotherm involves sufficiently small
temperature perturbations that continuity comparisons
between statically determined isotherms and those
reported here can be made.

Combining Eq. (3) and the first law of thermodynamics
we get

\int_{S_0}^{S_1} [T\, dS]_{Hug} = \frac{P_1 + P_0}{2}(V_0 - V_1) + \int_{V_0}^{V_1} [P\, dV]_{Hug}.

Differentiation of the latter gives

\frac{d}{dV_1} \int_{S_0}^{S_1} [T\, dS]_{Hug} = \frac{dP_1}{dV_1} \frac{(V_0 - V_1)}{2} + \frac{P_1}{2},

which, for a given Hugoniot curve, may be written as
a function f(V_1). Combining this result with Eq. (23)
we get

\frac{d}{dV_1} \int_{S_0}^{S_1} [T\, dS]_{Hug} = C_v \frac{dT_1}{dV_1} + (\partial P/\partial T)_V\, T_1 = f(V_1),

the solution of which is

T_1(V_1) = T_0 e^{b(V_0 - V_1)} + e^{-bV_1} \int_{V_0}^{V_1} \left[\frac{f(V) e^{bV}}{C_v}\, dV\right], (24)

where

b = (\partial P/\partial T)_V / C_v,    f(V) = \frac{1}{2} \frac{dP}{dV}(V_0 - V) + \frac{1}{2} P,

and the condition T = T_0 at V = V_0 was imposed.

The similar expression for temperature variation
along an adiabat is

T(V) = T_i e^{b(V_i - V)}, (25)

where T_i and V_i are initial conditions at some point on
the adiabat. This relation follows immediately from
Eq. (23). For two points of different T and the same V,
the pressure offset is given by

\Delta P|_V = (\partial P/\partial T)_V\, \Delta T|_V. (26)

Also, for the P=0 isobar

(V - V_0)_{P=0} = V_0 \alpha (T - T_0)_{P=0} (27)

relates temperature and specific volume. V_0 and T_0 refer to
normal specific volume and temperature while \alpha is an
average value of the thermal coefficient of volume
expansion, as before.

Fig. 10. Temperatures and isothermal compressions for copper.

Fig. 11. Temperatures and isothermal compressions for zinc.

Equation (24), augmented by the thermodynamic 
parameters listed in Table I, was used in straightforward 
numerical calculations of temperatures along the 
Hugoniot curves. Results of such calculations are 
presented in Figs. 9, 10, and 11. 
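A numerical calculation of the kind referred to above might look like the following minimal sketch, which evaluates Eq. (24) by the trapezoidal rule with constant C_v and (\partial P/\partial T)_V. The function name, the linear U_s = c_0 + s U_p form used for the example Hugoniot, and every numerical value are assumptions made for illustration; they are not the Table I data or the authors' procedure.

```python
import math


def hugoniot_temperature(P, dPdV, V0, V1, T0, Cv, dPdT_V, n=2000):
    """Temperature T_1(V_1) on the Hugoniot from Eq. (24).

    P and dPdV are callables giving the Hugoniot pressure and its slope
    as functions of specific volume; Cv and dPdT_V are constants.
    """
    b = dPdT_V / Cv

    def f(V):
        return 0.5 * dPdV(V) * (V0 - V) + 0.5 * P(V)

    h = (V1 - V0) / n                  # negative step for compression
    total = 0.0
    for i in range(n + 1):
        V = V0 + i * h
        w = 0.5 if i in (0, n) else 1.0
        # e^{-b V1} is folded into the integrand as e^{b (V - V1)}
        total += w * f(V) * math.exp(b * (V - V1)) / Cv
    return T0 * math.exp(b * (V0 - V1)) + total * h


# Illustrative aluminum-like numbers in SI units (assumed, not from Table I).
rho0, c0, s = 2785.0, 5328.0, 1.338
V0 = 1.0 / rho0


def P_hug(V):
    eta = 1.0 - V / V0
    return rho0 * c0 ** 2 * eta / (1.0 - s * eta) ** 2


def dPdV_hug(V):
    eta = 1.0 - V / V0
    return -rho0 ** 2 * c0 ** 2 * (1.0 + s * eta) / (1.0 - s * eta) ** 3


T1 = hugoniot_temperature(P_hug, dPdV_hug, V0, 0.8 * V0, 300.0, 900.0, 5.2e6)
print(T1)
```

For a Hugoniot that is a straight line in the P-V plane, f(V) vanishes and the result reduces to the adiabat expression of Eq. (25), which is a useful check on an implementation.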

Equation (25) was used in a similar determination of 
temperatures along pressure-release adiabats. Of partic- 
ular interest is the final temperature after the material 
TABLE III. Calculated adiabats and temperatures. The symbol \Delta T means temperature rise, °C, above an assumed initial temperature
of 27°C. P is measured in kilobars. The particular adiabat reported for each metal is the one that intersects the Hugoniot curve at
the high-pressure end of the latter curve. Temperature rises associated with P=0 on the adiabats (where V/V_0 is slightly greater than
unity, due to heating) are given as the numbers in parentheses, Figs. 9-11. The determination of additional adiabats (see Sec. IV)
involves only a very moderate amount of numerical labor.








Copper Aluminum Zinc 


Hugoniot Adiabat Hugoniot Adiabat Hugoniot Adiabat 


AT P / AT AT e P AT AT P 


194 14.6 232 «11.7 : 0 0 291 204 
12 254 27.2 : 15 
27 276 8644.6 : 32 











299 64.9 . 52 
323 = 88.0 L 76 
348 114.8 108 
374 146.6 151 
401 180.9 211 
430 220.5 296 
459 261.5 406 
490 305.6 521 
522 350 642 
768 
















is relieved adiabatically to zero pressure, obtained by 
combination of Eq. (25) and Eq. (27). Results of such 
calculations are also given in Figs. 9-11. 
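The combination of Eq. (25) and Eq. (27) amounts to finding where the release adiabat through (T_1, V_1) meets the P=0 isobar. A minimal sketch using bisection is given below; the function name and all numbers are hypothetical, and bisection is simply one convenient root finder, not necessarily the method used by the authors.

```python
import math


def release_temperature(T1, V1, V0, T0, b, alpha, tol=1e-9):
    """Temperature at which the adiabat of Eq. (25) meets the isobar of Eq. (27)."""

    def g(T):
        V = V0 * (1.0 + alpha * (T - T0))          # Eq. (27), P = 0 isobar
        return T - T1 * math.exp(b * (V1 - V))     # Eq. (25), adiabat from (T1, V1)

    lo, hi = T0, T1
    if g(lo) >= 0.0:
        return lo                                   # no residual heating
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical values in SI units, for illustration only.
V0 = 1.0 / 2785.0
Tf = release_temperature(T1=660.0, V1=0.8 * V0, V0=V0, T0=300.0,
                         b=5.2e6 / 900.0, alpha=6.9e-5)
print(Tf)
```

The function g is monotonically increasing in T, so the root bracketed between T_0 and T_1 is unique whenever the shock has produced net heating.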

Equation (26) combined with the known P—V locus 
of the Hugoniot curve and the input temperatures 
calculated above then permits the immediate determi- 
nation of isotherms and adiabats neighboring the 
Hugoniot curve. Resulting 27°C isotherms are plotted 
in Figs. 9-11 where they may be compared to results 
of static experimentation.^{8-10} A similarly determined
pressure release adiabat (corresponding to a shock
strength which is near the strongest attained) is listed
for each material in Table III.^{11}


8 P. W. Bridgman, Proc. Am. Acad. Arts Sci. 76, 55 (1948). 

9 P. W. Bridgman, Proc. Am. Acad. Arts Sci. 77, 187 (1949). 

10 P. W. Bridgman, Phys. Rev. 60, 351 (1941).

11 It is interesting to compute U_r/U_p ratios corresponding to
the data given in Table III, an application of Eqs. (6) and (7).
These values, 1.017, 1.011, and 1.026 for aluminum, copper, and
zinc, are all greater than unity and are fairly close to (though
slightly less than, of course) the corresponding maxima from Sec.
III. Similar ratios for a material whose Hugoniot exhibits considerable
curvature, however (e.g., an organic material), could be
less than unity. Corrections to the Hugoniot curve which are
indicated by the present ratios (see Sec. IIID) have been neglected.
Thermalization of Positrons in Metals

It is shown that a thermalization time of 3×10^{-12} sec follows from the assumption that the interaction
between a positron and a conduction electron can be approximated by an exponentially screened Coulomb
potential.





KNOWLEDGE of the thermalization time of a
positron in a metal is pertinent to the interpretation
of measurements of the angular correlation^{1,2}
of annihilation radiation and of the time distribution of
annihilation.^3 Garwin^4 has pointed out that the estimate
of 3X10- sec for thermalization in gold, made in 
Appendix I of reference 1, is too long because collisions 
between the positron and the conduction electrons are 
ignored; including this effect he estimated 10~“ sec 
for the thermalization time. Using a method similar to 
that of Sec. 5 of reference 5, we shall remove some of 
the crude approximations underlying Garwin’s estimate 
and show that it is too short. 

The conduction electrons are treated as a free- 
electron gas at the absolute zero of temperature. The 
Exclusion Principle is assumed not to apply to a 
system consisting of a positron and an electron. Transi- 
tions from an initial state consisting of a positron and an 
electron with wave vectors k_1 and k_2, respectively, to a
final state (k_1', k_2') occur because of the interaction
-e^2 r^{-1} exp(-qr). The number of transitions suffered
per second by the positron is t^{-1}P(t), P(t) being given
as a multiple integral in Sec. 5.^5 To find the energy lost
per second, we alter the integrand to include a factor
representing the energy loss in a given transition,
namely \hbar^2 (2m)^{-1}(k_1^2 - k_1'^2). After a number of elementary
integrations an expression for R, the rate of loss
of energy by the positron, can be found. We now
suppress subscripts and write k and E for positron
wave number and energy; also, for k/q we put x.


R= (3x) (meth) EF (k/q) 


5 56 6 
F(x) ar (+-=) log(1+-4x”) 
af Ce 


1 5 
+-(18-—) tan!2x% (1) 
2x? 


x 


20 


1 
1——a*+---} for |x|<«-. 
9 2 


1 DeBenedetti, Cowan, Konneker, and Primakoff, Phys. Rev.
77, 205 (1950).

2 J. B. Warren and G. M. Griffiths, Can. J. Phys. 29, 325 (1951).

3 R. E. Bell and R. L. Graham, Phys. Rev. 90, 644 (1953).

4 R. L. Garwin, Phys. Rev. 91, 1571 (1953).

5 G. E. Lee-Whiting, Proc. Roy. Soc. (London) A212, 362 (1952).


The definition of F(x) given is valid for E < E_F, E_F
being the Fermi energy of the free-electron gas. Note
that R is independent of E_F for E < E_F. A classical
mechanical computation of R would yield a result
proportional to the density of free electrons. In the
wave mechanical problem this proportionality would
become R \propto E_F^{3/2}. The difference between the two results
is caused by the action of the Pauli exclusion principle
in prohibiting transitions of conduction electrons to
states already occupied. One would expect to find
R \propto E_F^{3/2} for E > E_F.

Note that R approaches infinity as q goes to zero.
This limit corresponds to using an unshielded Coulomb
field as the perturbation and is equivalent to ignoring
the polarization of the free electrons about any particular
electron. The divergence of R is similar to the well-known
divergence of the total cross-section for Coulomb
scattering. We shall assume that the value of q appropriate
for positron-electron interaction is the same
as that found suitable for electron-electron interaction.
There is evidence, both empirical and theoretical,
that q is roughly equal to 10^8 cm^{-1}. Landsberg^6 found
that he could explain the low-energy tail on the L
emission line of metallic sodium with a value of q of
1.21×10^8 cm^{-1}. For values of q of roughly the same
size Wohlfarth^7 was able to get agreement between
calculated and measured electronic specific heats.
Although the method of calculating q used by Lee-Whiting^5
for a fast electron travelling in a free-electron
gas is not strictly applicable to the present problem,
it does give a value of q for sodium in good agreement
with Landsberg’s empirical estimate. More recently
Bohm and Pines^8 have shown that the short-range
interaction between two members of a free-electron
gas can indeed be approximated by the exponentially
screened Coulomb potential. The long-range interaction
is expressed in terms of oscillations of the whole
gas, the quantum energy for which is too large to
permit them to be important in the present problem.
Bohm and Pines give the screening
parameter in the form \beta k_F, k_F being the wave number
of an electron on the Fermi surface. They find that \beta
lies between 0.5 and 0.75 for metals, sodium having
the value 0.68. We take for convenience q = 1.02×10^8

6 P. T. Landsberg, Proc. Phys. Soc. (London) A62, 806 (1949).


7E. P. Wohlfarth, Phil. Mag. 41, 534 (19 
8D. Bohm and D. Pines, Phys. Rev. 92, 609 (1953). 
cm^{-1}, this being the value of k_F corresponding to a
Fermi energy of 4 ev. Because the dependence of R
on q is given roughly by R« q~, the uncertainty in the 
choice of q does lead to considerable uncertainty in R.
Comparison of the theoretical and empirical values of 
q for sodium enables one to say that, for sodium at 
least, our estimate of R is not in error by more than a 
factor of 5. 

By an approximate integration of (1) it is easily 
shown that the positron energy falls from 4 ev to 1 ev
in about 3X10-* sec, from 1 ev to 0.1 ev in about 
2X10-" sec and from 0.1 ev to 0.025 ev in about 
3×10^{-12} sec. Since positrons are observed^3 to annihilate
in metals with a lifetime of about 10^{-10} sec, most of
them must be thermalized before annihilation. 

The incompatibility of the fundamental assumptions 
of the time-dependent perturbation method found in 
reference 5 does not occur in this calculation. 




Electromagnetic Effects of Spin Wave Resonance in Ferromagnetic Metals 


W. S. Ament and G. T. Rado
Naval Research Laboratory, Washington, D.C. 


(Received November 15, 1954) 


It was shown experimentally by Rado and Weertman that under suitable conditions there is an observable 
effect of exchange interactions on the ferromagnetic resonance in metals. The present paper provides an 
electromagnetic theory of this “spin wave resonance” experiment and satisfactorily explains the exchange 
shift as well as the width and shape of the absorption line. A combined solution is obtained of Maxwell’s 
equations and the equation of motion of the magnetization vector M, the latter equation including the 
exchange term due to the nonuniform orientation of M in the skin depth. It is shown that the triple refraction 
caused by the exchange effect necessitates the introduction of new boundary conditions. The final result, 
which is checked numerically and by an approximate calculation, is an expression for the measurable 
surface impedance and the “equivalent isotropic permeability” derived therefrom. This result is discussed 
and generalized, the properties of thermal spin waves in metals are briefly considered, and previous theories 
of exchange effects in ferromagnetic resonance are shown to be inadequate. 


I. INTRODUCTION 


It was recently shown by Rado and Weertman^{1,2}
(to be referred to as RW) that under suitable 
conditions the effects of exchange interactions on the 
ferromagnetic resonance of metals can be observed 
experimentally. Such effects had not been observed 
previously but their physical basis has long been 
known. In the skin depth of a ferromagnetic metal the 
orientation of the magnetization vector M is not 
uniform so that the effective exchange field is not 
parallel to M. Thus there exists an exchange torque 
which is, in principle, capable of modifying the motion 
of M and the nature of the ferromagnetic resonance. 
Following RW, we refer occasionally to such a modified 
ferromagnetic resonance as “spin wave resonance.” 
The available theories of exchange effects in ferro- 
magnetic resonance are inadequate in two respects. 
First, they do not predict satisfactorily under what 
conditions such effects might actually be observable, 
so that RW had to choose their experimental conditions 
largely on the basis of physical considerations. Second, 
these theories do not provide a reliable quantitative 
description of the exchange effects, so that their use 


1G. T. Rado and J. R. Weertman, Phys. Rev. 94, 1386 (1954); 
the symbol yz? appearing in the next to last paragraph of this 
reference is a misprint and should read ye. 

2G. T. Rado and J. R. Weertman (to be published). 


offers at most a qualitative guidance in the interpre- 
tation of the RW experiments. 

The present work, which we reported briefly at an 
earlier date,^3 is an attempt to eliminate the theoretical
inadequacies mentioned above by giving a consistent 
description of spin wave resonance on the basis of 
electromagnetic theory. Such a description should make 
it possible to account for the position, width, and shape 
of the resonance line, as well as to evaluate the im- 
portant exchange factor A and the spectroscopic 
splitting factor g from the experimental results. 

Basically, the electromagnetic problem treated in the 
present paper involves a combined solution of the 
equation of motion of M (the so-called “spin wave 
equation”) and Maxwell’s equations, the solution being 
required to satisfy two sets of boundary conditions. 
The first set represents the usual continuity conditions 
on the tangential components of E and H, and the 
second set represents some new conditions that are 
imposed by the (semi-classically described) exchange 
effect on certain derivatives of M. Specifically, we 
consider a ferromagnetic metal, possessing a conductivity
\sigma and a saturation magnetization M_s, which is
exposed to a saturating static magnetic field of magnitude
H_z and to a microwave field of circular frequency
\omega. The measurable quantity we calculate is the surface


3 W. S. Ament and G. T. Rado, Phys. Rev. 94, 1411 (1954). 


impedance Z, but we find it convenient to introduce
another measurable quantity, called the “equivalent
isotropic permeability” \mu_{equ}, and to express Z in terms
of \mu_{equ}. Our final result, which can often be approximated
by our Eq. (31), is a theoretical expression for
\mu_{equ} in terms of the known parameters \omega, \sigma, M_s, and
H_z, and the unknown parameters A, g, and \lambda. The
relaxation frequency \lambda, which is a measure of whatever
damping mechanism of unknown origin may exist, has
been included in our calculation in order to permit a
theoretical comparison of the effects of exchange and
relaxation on the observed resonance line. However,
in the experiments of RW the relaxation effects prove
to be negligibly small, so that in their case there are
only two unknown parameters, A and g.

In Sec. II we formulate the problem and carry the 
solution of the differential equations sufficiently far to 
obtain a secular equation for the propagation constant 
k in the metal. Since the existence of the exchange
torque causes this equation to be cubic rather than
linear in k^2, the calculation of the amplitudes of the
newly introduced waves, and hence of the value of Z,
requires the additional boundary conditions mentioned
above and formulated in Sec. III. Using both sets of
boundary conditions and an algebraic “bialternant
method” which circumvents the necessity of actually
solving the secular equation, we then proceed, in Sec.
IV, to calculate Z and to derive our explicit but approximate
analytical formula for \mu_{equ}. In Sec. V, we discuss
certain limiting cases of our final result, and briefly
consider the method of curve matching used for extracting
A and g from the experimental \mu_{equ}. We do
not compare our formula in detail with the measured
\mu_{equ} since such a comparison is given by RW, but we
accurate numerical calculation which we performed on 
a digital computer. Next, we briefly discuss various 
generalizations of the result of Sec. IV, including the 
effect of a curved rather than plane ferromagnetic 
sample, the case of oblique rather than normal incidence 
of the microwaves on the sample, and the effect of 
changing the type of damping term used in the equation 
of motion. In Sec. VI, we give a critical discussion of 
the theoretical work of other authors and show that it 
resulted in some incorrect conclusions. 

Since in Sec. IV we omitted the lengthy algebraic 
details of our general solution, we give in Appendix A 
an outline of a simple alternative solution of the 
problem. This method is valid unless the value of H_z is
in the immediate vicinity of a certain specified value. 
The result of this alternative solution verifies our 
approximate formula and provides additional physical 
insight into the problem. Finally, in Appendix B, we 
derive the dispersion law for thermal spin waves from 
our general secular equation, and show that the usual 
dispersion law is not modified in any essential way 
even though Maxwell’s equations and the metallic 
conductivity have been taken into account. We then
suggest that in microwave resonance work at not too 
high temperatures the concept of a magnetization 
vector is indeed justified even if exchange effects are 
important, as in the experiments of RW. 


II. DIFFERENTIAL EQUATIONS


In the interior of a saturated ferromagnetic metal 
the propagation of microwaves is determined by 
Maxwell’s equations and the equation of motion of M. 
Putting B = H + 4\pi M, and using Gaussian units, we
write Maxwell’s equations in the form


\nabla \times E = -(1/c)\, \partial(H + 4\pi M)/\partial t, (1a)
\nabla \times H = (4\pi\sigma/c) E, (1b)


because the high values of \sigma obtaining in metals permit
us to neglect the displacement current compared to the 
conduction current at microwave frequencies. The 
equation of motion of M, sometimes known as the spin 
wave equation, can be written in the form 


(1/\gamma)\, \partial M/\partial t = M \times [H + (2A/M_s^2) \nabla^2 M - (\lambda/\gamma M_s^2)\, M \times H], (2)


where \gamma is given by ge/2mc, and the other quantities
have been introduced in Sec. I. The quantity in the
brackets of Eq. (2) is an effective magnetic field and
contains the following contributions. The first term,
H, is the actual magnetic field and includes all demagnetizing
fields; this term, as well as the effect of
anisotropy, will be discussed later in this Section. The
second term, which is proportional to the exchange
factor (or “exchange stiffness constant”) A, is the
effective exchange field due to the nonuniformity in
the orientation of M. The third term, which is proportional
to the relaxation frequency \lambda, is an effective
field that represents phenomenologically the influence
of any unknown damping mechanism. It may be noted
that the H occurring in this particular term ought to be
replaced by [H + (2A/M_s^2) \nabla^2 M] because the damping
describes the approach of M to the total field. However,
this correction is easily shown to be a second order
effect in most cases so that we shall use the simpler
expression given in Eq. (2).

Equation (2) was first obtained, in a slightly different
form, by Landau and Lifshitz^4 and used for the study
of domain wall motion. Neither their paper nor the
more recent work explains the physical origin of the
\lambda term, and it is not clear whether even the form of this
term correctly represents all the observed relaxation
phenomena. As mentioned in Sec. I, we include this 
term primarily to permit comparisons with the exchange 
term, and we postpone to Sec. V a discussion of the 
effect of an alternative (Bloch type) damping term. 
Concerning the nature of the exchange term, however, 


4 L. Landau and E. Lifshitz, Physik. Z. Sowjetunion 8, 153 (1935).
the situation is more satisfactory. Various authors^5
derived this term from the atomic model of a ferromagnet,
expressing A in terms of the Weiss molecular
field coefficient or the Bloch T^{3/2}-law coefficient, and
Herring^6 estimated A on the basis of the energy band
model of ferromagnetism. Since all the theoretical 
treatments of the A-term involve several uncertainties, 
it is well to realize that the form of this term follows 
from symmetry considerations and that the magnitude 
of A can be obtained from suitable experiments. It is 
for this latter purpose, of course, that the spin wave 
resonance experiment of RW and the present electro- 
magnetic calculation were undertaken. 

Returning to Eqs. (1) and (2), we now solve them 
for the case of a ferromagnetic metal in the form of a 
plane sample parallel to the xz plane, the air-metal 
boundary being at y=0. The static magnetic field H_z
is taken to be along the z axis, because for reasons 
discussed by RW the case of a static field normal to the 
plane of the sample is not well suited for detecting 
exchange effects. The applied microwaves are assumed 
to be plane waves normally incident upon the xz plane, 
the tangential component of their magnetic vector 
being along the x axis. Some generalizations of this 
physical situation will be discussed in Sec. V. 

We now decompose the fields into a static component 
and a microwave component, so that 


M = M_s i_z + m, (3a)
H = H_z i_z + h, (3b)
E = e, (3c)


where i, is a unit vector along the z axis, and the 
microwave components m, h, and e are understood to 
be proportional to exp(iw/—ky) in the metal. We 
further assume that |m|/M, and |h|/H, are small 
compared to unity. As to demagnetizing effects, a static 
demagnetizing correction due to the shape of the 
sample is assumed to have been applied so that H, is 
the static field inside the sample. Dynamic demagnet- 
izing corrections, on the other hand, must not be 
applied explicitly because they will emerge from the 
solution given below. It should also be noted that in 
certain simple cases the effect of anisotropy can easily 
be taken into account, as is well known.® For a single 
crystal with a direction of easy magnetization along the z axis, for
example, one simply adds to $H_z$ the value $2|K_1|/M_s$, or
$4|K_1|/3M_s$, depending on whether the first order anisotropy constant
$K_1$ is positive or negative. In the first experiments of RW, of
course, this problem does not arise because in their case $K_1$ is
approximately zero.
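
The anisotropy correction just described can be sketched as a small helper. This is only an illustration of the rule quoted in the text; the function name and argument names are hypothetical, and Gaussian units (oersted, erg/cm³, gauss) are assumed.

```python
def effective_static_field(H_z: float, K1: float, M_s: float) -> float:
    """Static field along z plus the anisotropy correction quoted in the text.

    Assumes a single crystal whose easy direction lies along the z axis.
    K1 > 0 adds 2|K1|/M_s; K1 < 0 adds 4|K1|/(3 M_s); K1 = 0 adds nothing.
    """
    if M_s <= 0:
        raise ValueError("M_s must be positive")
    if K1 > 0:
        return H_z + 2.0 * abs(K1) / M_s
    if K1 < 0:
        return H_z + 4.0 * abs(K1) / (3.0 * M_s)
    return H_z
```

For a material with $K_1 \approx 0$, as in the first RW experiments, the helper returns $H_z$ unchanged.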


5 For a recent treatment and references to earlier work see C. 
Kittel, Revs. Modern Phys. 21, 541 (1949); C. Herring and C. 
Kittel, Phys. Rev. 81, 869 (1951). 

6 C. Herring, Phys. Rev. 85, 1003 (1952); 87, 60 (1952). See
also the comments by E. P. Wohlfarth, Proc. Phys. Soc. (London)
A65, 1053 (1952).
Eliminating e from Eqs. (1) in the usual way by taking the curl of
Eq. (1b), we obtain with the use of Eqs. (3) and the above mentioned
assumptions

$$(i\delta^2 k^2/2)\,h_y\,\mathbf{i}_y - [1 + (i\delta^2 k^2/2)]\,\mathbf{h} = 4\pi\mathbf{m}, \tag{4}$$

or in component form

$$[1 + (i\delta^2 k^2/2)]\,h_x + 4\pi m_x = 0, \tag{5a}$$
$$h_y + 4\pi m_y = 0, \tag{5b}$$
$$[1 + (i\delta^2 k^2/2)]\,h_z + 4\pi m_z = 0, \tag{5c}$$

where $h_x$, $h_y$, $h_z$, and $m_x$, $m_y$, $m_z$, are the scalar
components of h and m, respectively, $\mathbf{i}_y$ is a unit vector
along the y direction, and

$$\delta = (c^2/2\pi\omega\sigma)^{1/2} \tag{6}$$

is the classical skin depth for permeability unity. Equation (5b) is
seen to express the Kittel⁷ demagnetizing effect without the explicit
introduction of a demagnetizing factor.
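
As a quick numerical companion to Eq. (6), the following sketch evaluates the classical skin depth in Gaussian units. The function name, the constant name, and the sample numbers in the usage note are illustrative assumptions, not values from the paper.

```python
import math

C_LIGHT = 2.99792458e10  # speed of light in cm/s (Gaussian units)


def skin_depth(sigma: float, omega: float) -> float:
    """Classical skin depth for permeability unity, Eq. (6).

    sigma: conductivity in Gaussian units (1/s)
    omega: angular frequency in rad/s
    Returns delta in cm, with delta^2 = c^2 / (2 pi omega sigma).
    """
    if sigma <= 0 or omega <= 0:
        raise ValueError("sigma and omega must be positive")
    return C_LIGHT / math.sqrt(2.0 * math.pi * sigma * omega)
```

For example, `skin_depth(5.0e17, 2 * math.pi * 1.0e10)` gives the depth for a hypothetical conductivity of 5×10¹⁷ s⁻¹ at 10 000 Mc/sec; since δ varies as ω^(-1/2), quadrupling the frequency halves the result.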

Next we combine Eqs. (2) and (3), obtaining with the same assumptions

$$(i\omega/\gamma)\,\mathbf{m} = \mathbf{i}_z \times \{M_s\mathbf{h} - [H_z - (2Ak^2/M_s)]\,\mathbf{m}\} - (\lambda/\gamma M_s)(H_z\mathbf{m}_{x,y} - M_s\mathbf{h}_{x,y}), \tag{7}$$

where $\mathbf{h}_{x,y}$ and $\mathbf{m}_{x,y}$ are the vector
components of h and m, respectively, in the x, y plane. In component
form, Eq. (7) becomes

$$(i\omega/\gamma)\,m_x = -M_s h_y + [H_z - (2Ak^2/M_s)]\,m_y - (\lambda/\gamma M_s)(H_z m_x - M_s h_x), \tag{8a}$$
$$(i\omega/\gamma)\,m_y = M_s h_x - [H_z - (2Ak^2/M_s)]\,m_x - (\lambda/\gamma M_s)(H_z m_y - M_s h_y), \tag{8b}$$
$$m_z = 0. \tag{8c}$$

It is seen that Eqs. (5) and (8) have a solution for which
$h_x = h_y = \mathbf{m} = 0$, and $h_z \neq 0$. This wave is not
excited by the assumed incident field, and we shall henceforth be
concerned only with those waves for which $h_z = 0$. If $h_y$ is
eliminated by means of Eq. (5b), then Eqs. (8a), (8b), and (5a)
constitute a system of three linear homogeneous equations for the
unknowns $m_y$, $m_x$, and $h_x$. Introducing the dimensionless
parameters,

$$\eta = H_z/(4\pi M_s), \tag{9a}$$
$$\Omega = \omega/(4\pi M_s\gamma), \tag{9b}$$
$$L = \lambda/(\gamma M_s), \tag{9c}$$
$$\epsilon = A/(2\pi M_s^2\delta^2), \tag{9d}$$
$$K = \epsilon^{1/2}k\delta, \tag{9e}$$

into these three equations, we obtain

$$(K^2 - 1 - \eta)\,m_y + (i\Omega + L\eta)\,m_x - (L/4\pi)\,h_x = 0, \tag{10a}$$
$$-(i\Omega + L\eta + L)\,m_y + (K^2 - \eta)\,m_x + (1/4\pi)\,h_x = 0, \tag{10b}$$
$$-8\pi i\epsilon\,m_x + (K^2 - 2i\epsilon)\,h_x = 0. \tag{10c}$$

7 C. Kittel, Phys. Rev. 71, 270 (1947).
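
The change of variables in Eqs. (9) can be collected in one helper. The sketch below is illustrative only: the class and function names are hypothetical, Gaussian units are assumed, and the reduced propagation constant is taken as $K = \epsilon^{1/2}k\delta$, which is the reading of Eq. (9e) that makes Eq. (10c) and Eq. (B5) agree.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ReducedParameters:
    eta: float    # H_z / (4 pi M_s), Eq. (9a)
    Omega: float  # omega / (4 pi M_s gamma), Eq. (9b)
    L: float      # lambda / (gamma M_s), Eq. (9c)
    eps: float    # A / (2 pi M_s^2 delta^2), Eq. (9d)


def reduced_parameters(H_z, M_s, omega, gamma, lam, A, delta):
    """Map the physical quantities onto the dimensionless set of Eqs. (9a)-(9d)."""
    four_pi_M = 4.0 * math.pi * M_s
    return ReducedParameters(
        eta=H_z / four_pi_M,
        Omega=omega / (four_pi_M * gamma),
        L=lam / (gamma * M_s),
        eps=A / (2.0 * math.pi * M_s ** 2 * delta ** 2),
    )


def reduced_wavenumber(k: complex, delta: float, eps: float) -> complex:
    """K = eps^(1/2) * k * delta (assumed reading of Eq. (9e))."""
    return math.sqrt(eps) * delta * k
```

Keeping the four parameters in one immutable record makes it harder to mix values computed for different fields or frequencies.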
In order that Eqs. (10) possess a nonvanishing solution, the
determinant of the coefficients must vanish. This requirement leads to
the secular equation

$$K^6 - c_1K^4 + c_2K^2 - c_3 = 0, \tag{11}$$

where $c_1$, $c_2$, $c_3$ are given by

$$c_1 = 1 + 2\eta + 2i\epsilon, \tag{12a}$$
$$c_2 = \eta - \Omega^2 + i\Omega L + \eta L(2i\Omega + L) + \eta^2(1 + L^2) + 4i\epsilon(1 + \eta), \tag{12b}$$
$$c_3 = 2i\epsilon\{(1 + \eta)^2 - \Omega^2 + L(1 + \eta)[2i\Omega + L(1 + \eta)]\}. \tag{12c}$$

Since the secular equation is cubic in $K^2$, there are three
propagation constants $k_1$, $k_2$, $k_3$ (obtained from $K_1$, $K_2$,
$K_3$) whose real part is positive; the corresponding three waves,
propagating along $\mathbf{i}_y$, represent energy flow into the metal.
(The other three waves, propagating along $-\mathbf{i}_y$, are of no
physical interest because the metal sample is assumed to be very thick
compared to the penetration depth of the microwaves, so that there are
no reflected waves inside the metal.) For each of the three waves
$(K_1, K_2, K_3)$ the field components can be expressed in terms of
$h_x$ provided Eq. (11) is solved. The resulting relations can be
written in compact form by affixing the subscript n (n = 1, 2, 3) to
specify which of the three K-values is referred to. Using any two of
the Eqs. (10) [although Eqs. (10a) and (10b) prove to be the most
convenient], we thus obtain

$$m_{nx} = u_n h_{nx}, \tag{13}$$
$$m_{ny} = v_n h_{nx}, \tag{14}$$

where $u_n$ and $v_n$, being abbreviations for certain known functions
of the coefficients of Eqs. (10), evidently depend on $K_n^2$.
Similarly, Eq. (1b) leads to

$$e_{nz} = (cK_n/4\pi\sigma\epsilon^{1/2}\delta)\,h_{nx}, \tag{15}$$

and Eqs. (5b) and (14) give

$$h_{ny} = -4\pi v_n h_{nx}. \tag{16}$$

III. BOUNDARY CONDITIONS


The observable electromagnetic properties of a metal are fully
determined by specifying, in the case of linear polarization, the ratio
of the tangential components of e and h at the air-metal boundary. We
call this ratio the “surface impedance” Z, so that for our field
configuration

$$Z = (e_z/h_x)_{y=0}. \tag{17}$$

The Z defined by Eq. (17) is dimensionless and would have to be
multiplied by a factor having the dimensions of (velocity)$^{-1}$, such
as $(4\pi/c)$, in order to have the dimensions of an impedance.
However, the simple definition of Z given by Eq. (17) is adequate when
used consistently.

In the absence of exchange effects the value of Z and the amplitude of
the single wave existing in the metal can be calculated, as is well
known, by using the boundary conditions satisfied by the tangential
components of e and h. For the present case these conditions are

$$h_x \text{ continuous at } y = 0, \tag{18}$$
$$e_z \text{ continuous at } y = 0. \tag{19}$$

In the presence of exchange effects, however, two additional waves
exist in the metal (see Sec. II), so that the calculation of Z and of
the amplitudes of the three waves requires two new boundary conditions.
These new conditions, formulated below, are evidently a consequence of
exchange effects.

As discussed in connection with Eq. (2), the effective exchange field
is

$$\mathbf{H}_{ex} = (2A/M_s^2)\nabla^2\mathbf{M} = (2A/M_s^2)\,\partial^2\mathbf{m}/\partial y^2,$$

an expression which is usually derived⁵ from the Dirac cosine coupling
between neighboring spins. Since this coupling implies that the torque
exerted by the ith spin on the jth spin is equal and opposite to the
torque exerted by the jth spin on the ith spin, it follows that the
total exchange torque inside the specimen vanishes. Consequently, the
total exchange torque per unit area of air-metal boundary is given by


$$\int_0^\infty \mathbf{M} \times \mathbf{H}_{ex}\,dy = 0,$$

where the upper limit of integration is taken as infinity since the
sample is assumed to be much thicker than the skin depth. Integrating
by parts, we thus obtain

$$(2A/M_s^2)\,[\mathbf{M} \times \partial\mathbf{m}/\partial y]_0^\infty = 0,$$

which leads (with $|\mathbf{m}| \ll M_s$) to

$$(2A/M_s^2)\,M_s\,\mathbf{i}_z \times (\partial\mathbf{m}/\partial y)_{y=0} = 0,$$

because $(\partial\mathbf{m}/\partial y)_{y=\infty}$ evidently
vanishes. Since A is not zero, and m is always perpendicular to
$\mathbf{i}_z$, we obtain the new boundary conditions

$$(\partial m_x/\partial y) = 0 \text{ at } y = 0, \tag{20}$$
$$(\partial m_y/\partial y) = 0 \text{ at } y = 0, \tag{21}$$

which we shall use for calculating Z.

It should be noted that in the arguments leading up to Eqs. (20) and
(21), we have omitted any explicit consideration of the exchange field
at the air-metal boundary. But at this boundary the exchange field is
not equal to the $\mathbf{H}_{ex}$ used above because at y=0 the −y
direction is not equivalent to the +y direction. In fact, we find by
extending the usual derivation⁵ of the effective exchange field that at
the boundary

$$(\mathbf{H}_{ex})_{y=0} = (2A/M_s^2)\,[(f/a)(\partial\mathbf{m}/\partial y) + (\partial^2\mathbf{m}/\partial y^2)]_{y=0}, \tag{22a}$$
where f is a numerical factor of order unity that depends on the
lattice type, and a is the lattice spacing. Thus the total exchange
torque per unit area of air-metal boundary is given by

$$(2A/M_s^2)\Big[\mathbf{M} \times (f/a)(a/f')(\partial\mathbf{m}/\partial y)_{y=0} + \int_0^\infty \mathbf{M} \times (\partial^2\mathbf{m}/\partial y^2)\,dy\Big] = 0, \tag{22b}$$

where f′ is a numerical factor of order unity which is defined in such
a way that (a/f′) is half the separation between neighboring spins
along the y direction. (For a b.c.c. lattice, f = f′ = 4.) While
Eq. (22b) evidently leads to the same boundary conditions, Eqs. (20)
and (21), which we derived above, it should be noted that the first
term in the brackets involves the assumption that the exchange torque
in a slab of area unity and thickness (a/f′) may be expressed in terms
of the mean exchange torque density in the slab. This assumption is
admittedly questionable, but we believe it to be equivalent to the
usual “continuum hypothesis” which is implied whenever
$\mathbf{H}_{ex}$ is described by a differential expression. This
continuum hypothesis asserts that the point lattice of electron spins
envisaged in Heisenberg’s model of ferromagnetism may legitimately be
replaced by a continuum for the purpose of calculating
$\mathbf{H}_{ex}$. As long as the effective skin depth is large
compared to the lattice spacing a, the continuum hypothesis as well as
our boundary condition is probably a good approximation.

Using the values of the microwave components given by Eqs. (13), (14),
and (15), we can now write down the boundary condition equations
(denoted by primes) which correspond to Eqs. (18), (19), (20), and
(21). Introducing the abbreviation

$$Z' = (4\pi\sigma\epsilon^{1/2}\delta/c)\,Z, \tag{23}$$

and taking into account all three waves in the metal, we thus obtain

$$h_{1x} + h_{2x} + h_{3x} = h_{0x}, \tag{18$'$}$$
$$K_1h_{1x} + K_2h_{2x} + K_3h_{3x} = Z'h_{0x}, \tag{19$'$}$$
$$u_1K_1h_{1x} + u_2K_2h_{2x} + u_3K_3h_{3x} = 0, \tag{20$'$}$$
$$v_1K_1h_{1x} + v_2K_2h_{2x} + v_3K_3h_{3x} = 0, \tag{21$'$}$$

where $h_{0x}$ denotes the value of $h_x$ in air. This system of four
linear homogeneous equations possesses a nonvanishing solution provided
the condition

$$\begin{vmatrix} 1 & 1 & 1 & 1 \\ Z' & K_1 & K_2 & K_3 \\ 0 & u_1K_1 & u_2K_2 & u_3K_3 \\ 0 & v_1K_1 & v_2K_2 & v_3K_3 \end{vmatrix} = 0 \tag{24}$$

is satisfied. Equation (24) will be used in the following section to
calculate Z′ and hence Z.


W. S. AMENT AND G. T. RADO 


IV. THE SURFACE IMPEDANCE AND THE 
EQUIVALENT ISOTROPIC PERMEABILITY 

Using Eq. (24), we could now express Z′ in terms of known quantities
if the secular equation (11) had actually been solved. While Eq. (11)
could, in principle, be solved in closed form, such a solution would be
rather cumbersome and not very useful. We shall circumvent this
difficulty by making use of the fact that Z′ can be represented in
terms of certain symmetric functions of the roots $(K_1, K_2, K_3)$ of
Eq. (11). To see this, notice that the coefficients of the elements 1,
Z′ of the first column of Eq. (24) are antisymmetric polynomial
functions of $K_1$, $K_2$, $K_3$. All such functions have a common
antisymmetric factor, the remaining symmetric factors being directly
expressible, through the theory of bialternants,⁸ in terms of certain
symmetric functions P, Q, R, which will be defined later. This fact so
simplifies the algebra that we shall use the term “bialternant method”
to denote the present procedure of bypassing the explicit solution of
the secular polynomial.

Solving Eq. (24) for Z′ and inserting the explicit expressions (which
we have not written down) for the $u_n$ and $v_n$ in terms of the
$K_n^2$, we obtain after a lengthy calculation

$$Z' = \frac{R(QP - R)}{RP + R(1 + 2\eta) + Qd}, \tag{25}$$

where P, Q, R denote the symmetric functions

$$P = K_1K_2 + K_2K_3 + K_3K_1, \tag{26a}$$
$$Q = K_1 + K_2 + K_3, \tag{26b}$$
$$R = K_1K_2K_3, \tag{26c}$$

and d is an abbreviation for the quantity

$$d = \eta - \Omega^2 + i\Omega L + \eta(\eta + 2i\Omega L + L^2 + L^2\eta). \tag{27}$$

Next we make use of the well-known fact that the roots of the cubic
equation (11) are related to its coefficients by the equations

$$K_1^2 + K_2^2 + K_3^2 = c_1, \tag{28a}$$
$$K_1^2K_2^2 + K_2^2K_3^2 + K_3^2K_1^2 = c_2, \tag{28b}$$
$$K_1^2K_2^2K_3^2 = c_3. \tag{28c}$$

From Eqs. (26) and (28), we now obtain

$$Q^2 = c_1 + 2P, \tag{29a}$$
$$P^2 = c_2 + 2c_3^{1/2}Q, \tag{29b}$$
$$R = c_3^{1/2}, \tag{29c}$$

where the sign of $c_3^{1/2}$ must be taken as positive because we are
only interested in waves whose propagation constant has a positive real
part. Finally, we could (in principle) solve the simultaneous quadratic
equations (29a) and (29b) for P and Q, substitute the resulting values
[together with the R from Eq. (29c)] into Eq. (25), and thus calculate
Z′ or Z. However, we shall not carry out these steps in full generality
because it is more convenient to use the analytic approximation
discussed below.

8 A. C. Aitken, Determinants and Matrices (Oliver and Boyd,
Edinburgh and London, 1951); see especially p. 117.
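
The bialternant result can be cross-checked numerically by solving the cubic explicitly and evaluating the 4×4 condition directly. The sketch below does this under stated assumptions: $u_n$ and $v_n$ are taken from Eqs. (10c) and (10b) without approximation, the cubic is solved by Cardano's formula followed by a few Newton steps, and all function names and sample parameters are hypothetical.

```python
import cmath
import math


def secular_coefficients(eta, Omega, L, eps):
    """Coefficients c1, c2, c3 of the cubic in K^2, Eqs. (12a)-(12c)."""
    c1 = 1 + 2 * eta + 2j * eps
    c2 = (eta - Omega ** 2 + 1j * Omega * L + eta * L * (2j * Omega + L)
          + eta ** 2 * (1 + L ** 2) + 4j * eps * (1 + eta))
    c3 = 2j * eps * ((1 + eta) ** 2 - Omega ** 2
                     + L * (1 + eta) * (2j * Omega + L * (1 + eta)))
    return c1, c2, c3


def cubic_roots(c1, c2, c3):
    """Roots x of x^3 - c1 x^2 + c2 x - c3 = 0 (Cardano, then Newton polish)."""
    a, b, c = -c1, c2, -c3
    d0 = a * a - 3 * b
    d1 = 2 * a ** 3 - 9 * a * b + 27 * c
    s = cmath.sqrt(d1 * d1 - 4 * d0 ** 3)
    t = d1 + s if abs(d1 + s) >= abs(d1 - s) else d1 - s  # avoid cancellation
    C = (t / 2) ** (1 / 3)
    xi = complex(-0.5, math.sqrt(3) / 2)  # primitive cube root of unity
    roots = [-(a + xi ** n * C + d0 / (xi ** n * C)) / 3 for n in range(3)]
    for _ in range(3):
        roots = [x - (x ** 3 + a * x * x + b * x + c) / (3 * x * x + 2 * a * x + b)
                 for x in roots]
    return roots


def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def wave_numbers(eta, Omega, L, eps):
    """K_n with non-negative real part (principal square roots of the K_n^2)."""
    return [cmath.sqrt(x) for x in cubic_roots(*secular_coefficients(eta, Omega, L, eps))]


def z_prime_direct(eta, Omega, L, eps):
    """Z' from Eq. (24), expanded along its first column."""
    K = wave_numbers(eta, Omega, L, eps)
    u = [(k * k - 2j * eps) / (8j * math.pi * eps) for k in K]            # Eq. (10c)
    v = [((k * k - eta) * un + 1 / (4 * math.pi)) / (1j * Omega + L * eta + L)
         for k, un in zip(K, u)]                                          # Eq. (10b)
    row_u = [un * k for un, k in zip(u, K)]
    row_v = [vn * k for vn, k in zip(v, K)]
    return det3([K, row_u, row_v]) / det3([[1, 1, 1], row_u, row_v])


def z_prime_bialternant(eta, Omega, L, eps):
    """Z' from Eq. (25) with P, Q, R of Eqs. (26) and d of Eq. (27)."""
    K1, K2, K3 = wave_numbers(eta, Omega, L, eps)
    P = K1 * K2 + K2 * K3 + K3 * K1
    Q = K1 + K2 + K3
    R = K1 * K2 * K3
    d = eta - Omega ** 2 + 1j * Omega * L + eta * (eta + 2j * Omega * L + L ** 2 + L ** 2 * eta)
    return R * (Q * P - R) / (R * P + R * (1 + 2 * eta) + Q * d)
```

The two functions should agree to rounding error for any admissible parameter set, which is a useful guard against sign slips when transcribing Eqs. (25) through (27).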

Once Z is calculated, the comparison with experimental results is most
easily carried out by introducing the concept of “equivalent isotropic
permeability,” denoted by $\mu_{equ}$. Following RW, we define
$\mu_{equ}$ to be that isotropic complex permeability
($\mu_{equ} = \mu_1 - i\mu_2$) which gives rise to the same surface
impedance as the actual relation obtaining between the vectors b and h.
Since it is easily shown that in an isotropic situation, characterized
by $\mathbf{b} = \mu_{equ}\mathbf{h}$, Maxwell’s equations for a metal
lead to $Z = (\mu_{equ}/\epsilon_{eff})^{1/2}$, where
$\epsilon_{eff} = -4\pi i\sigma/\omega$ is the effective dielectric
constant, it follows that

$$\mu_{equ} = -2i(cZ/\omega\delta)^2 = -(i/2)(Z'^2/\epsilon), \tag{30}$$

where the quantity $\epsilon$, defined by Eq. (9d), should not be
confused with $\epsilon_{eff}$. The $\mu_{equ}$ calculated from
Eq. (30), which we may call $(\mu_{equ})_{calc}$, can be compared with
the results of resonance experiments by deducing from the latter a
value for $(\mu_{equ})_{exper}$. To do this, one simply interprets the
measured quality factor and resonance frequency of a cavity (or
attenuation factor and phase velocity of a transmission line) on the
basis of Maxwell’s equations by proceeding as if b and h were related
by $\mathbf{b} = (\mu_{equ})_{exper}\mathbf{h}$, and then compares
$(\mu_{equ})_{exper}$ with $(\mu_{equ})_{calc}$.

Returning to the problem of actually calculating Z′, and hence
$\mu_{equ}$, by the method outlined above, we now restrict ourselves to
the case where each of the quantities $\eta$, $\Omega^2$, $L^2$,
$\Omega L$, and $\epsilon$ is negligible compared to unity. With these
approximations, which are valid in the RW experiments, the “coupling
constant” $c_3^{1/2}$ of Eqs. (29a) and (29b) is quite small and
permits an approximate solution of these simultaneous equations.
Equations (29), (25), and (30) then lead to our final result:

$$\mu_{equ} = \frac{\eta - \Omega^2 + i\Omega L + 2\epsilon^{1/2}(1 + i)}{[\eta - \Omega^2 + i\Omega L + \epsilon^{1/2}(1 + i)]^2}, \tag{31}$$

where $\eta$, $\Omega$, L, and $\epsilon$ are defined by Eqs. (9a)
through (9d).

V. DISCUSSION AND GENERALIZATION
OF THE RESULT

If we disregard the exchange effect, so that A and $\epsilon$ vanish,
then Eq. (31) gives

$$\mu_{equ} = 1/(\eta - \Omega^2 + i\Omega L), \tag{31a}$$

a special result that can be derived without using the methods of the
present paper. To do this, one simply ignores Maxwell’s equations
[except for the demagnetizing condition (5b)], solves the equation of
motion [Eq. (2)] subject to the approximations noted in the last
paragraph, calculates the quantity $\mu_x = b_x/h_x$, and identifies
$\mu_x$ with $\mu_{equ}$. The result thus obtained, which is identical
with Eq. (31a), shows that in this special case the line width and
shape are essentially determined by $\lambda$, and that the resonance
condition is $\eta = \Omega^2$. Since in our work
$\eta = H_z/(4\pi M_s)$ is neglected compared to unity, the condition
$\eta = \Omega^2$ is evidently equivalent to Kittel’s formula,⁷
$\omega = \gamma(B_zH_z)^{1/2}$, for the resonance condition in the
absence of exchange.

If, on the other hand, we disregard the phenomenological damping
effect, so that $\lambda$ and L vanish, then Eq. (31) gives

$$\mu_{equ} = \frac{\eta - \Omega^2 + 2\epsilon^{1/2}(1 + i)}{[\eta - \Omega^2 + \epsilon^{1/2}(1 + i)]^2}. \tag{31b}$$

Equation (31b) shows that the line width and shape are essentially
determined by $\sigma$ and A, i.e., by the combined effect of eddy
current dissipation and exchange. The resonance field, defined as the
$H_z$ corresponding to $\mu_1 = 0$, is now given by
$\eta \approx \Omega^2 - (0.7044)\epsilon^{1/2}$, thus being shifted to
a value smaller than that predicted by Eq. (31a). We note that
Eq. (31b) proves to be a fairly good representation of the experimental
results of RW, who observed a fractional exchange shift of the
resonance field amounting to 20 or 30 percent, and we refer to their
papers for numerical evaluations of Eq. (31b) and for a detailed
comparison between theory and experiment.

Next we outline the method of curve matching used to extract A and g
from the experimental $\mu_{equ}$. Introducing the quantity

$$N = (\epsilon^{1/2} + \eta - \Omega^2)/\epsilon^{1/2}, \tag{32}$$

we write Eq. (31b) in the form

$$\epsilon^{1/2}\mu_{equ} = (N + 1 + 2i)/(N + i)^2, \tag{33}$$

so that if we regard the real number N as a parameter, we can construct
a universal curve by plotting the imaginary part of the right-hand side
of Eq. (33) as a function of the real part. The resulting curve is
“egg-shaped” and contains points corresponding to all possible values
of N. Next we multiply all the experimental $\mu_{equ}$ by a scale
factor, to be identified with $\epsilon^{1/2}$, which is chosen in such
a way that the product $\epsilon^{1/2}(\mu_{equ})_{exper}$, when
plotted with its imaginary part as a function of its real part, matches
the theoretical egg-shaped curve described above. This value of
$\epsilon$ determines A [see Eq. (9d)] because $M_s$ is known and
$\delta$ can be calculated from $\omega$ and $\sigma$. Finally, we
obtain g from any convenient point on the egg-shaped curve. To do this,
we simply choose some value of $H_z$, compute the corresponding
$\epsilon^{1/2}(\mu_{equ})_{exper}$ from the $\epsilon$ determined
above and the experimental data, and then ascertain from the egg-shaped
curve the value of N corresponding to this particular
$\epsilon^{1/2}(\mu_{equ})_{exper}$. Knowing N and $H_z$, we then
compute $\Omega$ from Eq. (32), and $\gamma$ (and hence g) from
Eq. (9b).
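
The final formula and the universal curve used for curve matching are easy to evaluate numerically. The sketch below assumes Eq. (31) in the form $\mu_{equ} = [w + 2(1+i)\epsilon^{1/2}]/[w + (1+i)\epsilon^{1/2}]^2$ with $w = \eta - \Omega^2 + i\Omega L$, i.e. with the exchange parameter entering as $\epsilon^{1/2}$; this reading is an assumption, chosen because it is the one consistent with Eqs. (A2a), (A7), and (30). Function names are hypothetical.

```python
import math


def mu_equ(eta, Omega, L, eps):
    """Equivalent isotropic permeability, Eq. (31) (assumed eps^(1/2) form)."""
    w = eta - Omega ** 2 + 1j * Omega * L
    a = (1 + 1j) * math.sqrt(eps)
    return (w + 2 * a) / (w + a) ** 2


def universal_curve(N):
    """Right-hand side of Eq. (33): eps^(1/2) * mu_equ for zero damping."""
    return (N + 1 + 2j) / (N + 1j) ** 2


def resonance_N(lo=0.0, hi=1.0, tol=1e-12):
    """Bisect for the N at which the real part of Eq. (33) vanishes (mu_1 = 0)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if universal_curve(mid).real > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

Because $\eta - \Omega^2 = (N - 1)\epsilon^{1/2}$ by Eq. (32), the value `1 - resonance_N()` is the numerical coefficient of the exchange shift of the resonance field, and can be compared with the coefficient 0.7044 quoted in the text.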
It is worth noting in this connection that if exchange effects are
absent, then Eq. (31a) shows that a plot of the imaginary part of
$\mu_{equ}$ as a function of its real part should be a circle tangent
to the real axis at the origin. Thus the appearance of an egg-shaped
curve in the complex-plane representation of $(\mu_{equ})_{exper}$
immediately suggests that the “zero-exchange” formula (31a) is
inadequate, so that Eqs. (31b) or (31) must be considered in
interpreting the experiments.
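
The circle property of the zero-exchange formula follows because $1/(x + ic)$, with x real and c fixed, traces a circle through the origin. A minimal numerical check, assuming Eq. (31a) in the form $\mu_{equ} = 1/(\eta - \Omega^2 + i\Omega L)$ and using hypothetical function names, might read:

```python
def mu_zero_exchange(eta, Omega, L):
    """Eq. (31a): permeability when the exchange constant A vanishes."""
    return 1 / (eta - Omega ** 2 + 1j * Omega * L)


def circle_center_and_radius(Omega, L):
    """Circle traced by Eq. (31a) as eta varies at fixed Omega and L.

    With c = Omega * L, the locus is |mu + i/(2c)| = 1/(2c), which touches
    the real axis at the origin.
    """
    c = Omega * L
    return complex(0.0, -1.0 / (2.0 * c)), 1.0 / (2.0 * c)
```

Every value of `mu_zero_exchange` at fixed `Omega` and `L` should lie at distance `radius` from `center`; a measured locus that departs from this circle is the signature of exchange discussed above.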

To assess the error caused by the analytical approximation involved in
our final result, Eq. (31), we checked this equation by two methods. In
the first method, we used a digital computer, the NAREC electronic
computer at the Naval Research Laboratory, to solve Eqs. (29a) and
(29b) numerically by a cyclic procedure of successive approximations.
The numerical values of $\Omega$, L, and $\epsilon$ chosen for this
purpose were typical of those encountered in the RW experiments, and
$H_z$ was regarded as a parameter. The values of P and Q thus obtained,
together with the R from Eq. (29c), were then substituted into
Eqs. (25) and (30) to yield computed values of $\mu_{equ}$ as a
function of $H_z$. When plotted in the complex plane representation
described above, these computed values of $\mu_{equ}$ led to an
egg-shaped curve whose ordinates and abscissas agreed to better than
five percent with the prediction of Eq. (31). In the second method,
presented in Appendix A, we used a power series expansion which
contains more stringent approximations than those involved in the
derivation of Eq. (31). However, this method permits an analytic
solution of the secular equation (11) and leads to a final result which
is invalid for a narrow range of $H_z$ values but agrees otherwise with
Eq. (31).

Since our final result, Eq. (31), was derived for the somewhat
specialized physical conditions assumed in Sec. II, we shall now
discuss three ways of generalizing these conditions (to apply to the
experimental situation of RW) without altering the validity of
Eq. (31).

(1) Equation (31) presumes a plane sample. However, it is easily shown
that this equation is equally valid for a curved sample provided
$|1/k|$ is negligible compared to the radius of curvature r. This
condition means roughly that the “effective” skin depth is small
compared to r so that the space dependence of the waves in the metal is
exponential. Thus Eq. (31) is valid, for example, in the case of the
cylindrical geometry used in the experiments of RW. In that case the
proof involves the replacement of the Bessel functions appearing in the
problem by their asymptotic values, as in the isotropic⁹ situation.

(2) Equation (31) presumes normal incidence of the microwaves upon the
sample. However, in the RW experiments (and in most ferromagnetic
resonance experiments) the propagation vector possesses a component
parallel to the surface of the sample, and it is in fact from measured
effects resulting from this parallel component, such as the change due
to $H_z$ of the quality factor and resonance frequency of a cavity
resonator, that the surface impedance and hence $\mu_{equ}$ is
determined experimentally. We must therefore investigate whether the
oblique incidence of the microwaves upon the sample modifies Eq. (31).
To do this, we assume that the microwave components m, h, and e are
proportional to $\exp(i\omega t - ky - pz)$, thus adding a
z-dependence, and repeat the calculation leading to Eqs. (5) and (8).
We find that if we make the approximation (generally valid at microwave
frequencies) that $|p^2|$ is negligible compared to $|k^2|$, and
$|i\delta^2p^2/2|$ is negligible compared to unity, then Eqs. (5a),
(5b), and (8) remain unchanged. Equation (5c), however, no longer
predicts $h_z = 0$, but leads instead to the relation

$$h_z = \frac{(i\delta^2k^2/2)}{1 + (i\delta^2k^2/2)}\,\frac{p}{k}\,h_y. \tag{34}$$

Since the absolute magnitude of the first factor on the right-hand side
of Eq. (34) is at most of order unity, we obtain
$|h_z| \lesssim |h_yp/k|$, showing that $h_z$ is negligibly small
compared to $h_y$. But since $h_y$ is of the same order as the other
“old” field components, we have effectively $h_z \approx 0$. This
result is a consequence of the fact that the wavelength in the metal,
which is of the order of the “effective” skin depth $|1/k|$, is
negligibly small compared to the wavelength in the airspace bounded by
the metal, which is of the order of $|1/p|$ or $\sim 2\pi c/\omega$.
Thus we see that in the approximation considered here, all of the
Eqs. (5) and (8) are unchanged, and consequently the remaining theory
and the final result in the case of oblique incidence is the same as in
the case of normal incidence.

9 M. H. Johnson and G. T. Rado, Phys. Rev. 75, 841 (1949).

(3) Equation (31) presumes that the unknown damping mechanism is
described by the phenomenological Landau-Lifshitz damping term. As
mentioned in Sec. II, however, the validity of this term is by no means
assured by existing experimental results, and some ferromagnetic
resonance experiments¹⁰ indicate, in fact, that the Bloch-type
phenomenological damping term is sometimes preferable. To investigate
the effect of Bloch-type damping on our result, we simply replace the
last term in Eq. (7) (which is proportional to $\lambda$) by the
simpler term $-\mathbf{m}_{x,y}/(\gamma T_2)$, where the quantity
$T_2$, known as the transverse relaxation time, is (like $\lambda$) a
phenomenological constant. Thus we omit throughout our calculation all
terms arising from the term $\lambda\mathbf{h}_{x,y}/\gamma$ of
Eq. (7), and replace $\lambda$ by $M_s/(H_zT_2)$ in the remaining
terms. The final result obtained in this way turns out to be identical
with Eq. (31b), so that the replacement of the Landau-Lifshitz damping
by the Bloch damping leads to a replacement of Eq. (31) by Eq. (31b).
This means that in our approximation the Bloch damping can be described
by putting $\lambda = 0$ in Eq. (31). Since the experiments of RW can
be interpreted on the basis of $\lambda = 0$, i.e., on the basis of
Eq. (31b), the question of Landau-Lifshitz damping versus Bloch damping
does not arise in their case.

10 J. A. Young, Jr., and E. A. Uehling, Phys. Rev. 90, 990
(1953); 94, 544 (1954).


VI. DISCUSSION OF PREVIOUS THEORIES 


Kittel and Herring¹¹ were the first to calculate an explicit magnitude
for the exchange shift of the resonance. However, they used a
perturbation method and did not take Maxwell’s equations properly into
account. As discussed by RW, the formula of Kittel and Herring can be
used for rough qualitative purposes but not for a quantitative
prediction of the exchange shift. This is due to the fact that the
Kittel-Herring formula contains the unknown factor $(\mu_2)_{unp}$, the
imaginary part of the “unperturbed” permeability. It is clearly a task
of the theory to predict $(\mu_2)_{unp}$ from the fundamental constants
of the material, and in the absence of such a prediction
$(\mu_2)_{unp}$ is not known a priori unless it is taken from
experimental results. But since the exchange effect cannot be “switched
off,” the experimental $\mu_2$ automatically includes the exchange
effect, and is therefore a “perturbed” $\mu_2$, so that the
Kittel-Herring perturbation treatment is not strictly valid unless the
exchange effect is so small that it is experimentally uninteresting. We
derived the Kittel-Herring result from our final equation (31) by a
power series expansion, and investigated under what condition this
derivation is valid. We found that to obtain their formula we had to
assume that the quantity
$|\epsilon^{1/2}/(\eta - \Omega^2 + i\Omega L)|$ is small compared to
unity (which is generally not a permissible assumption), and that we
had to carry the expansion to the second approximation in this
quantity. Furthermore, we had to assume that $\lambda$ is not zero, an
assumption which is particularly serious because the exchange effect
must evidently be calculable for $\lambda = 0$, as shown by our
Eq. (31b). It should also be noted that Kittel and Herring have not
calculated the width and shape of the resonance line, so that their
theory cannot be used to decide whether the width of an observed
resonance line is due to eddy current losses (caused by the exchange
effect) or relaxation phenomena. Finally, Kittel and Herring concluded
that at microwave frequencies the exchange effects in ferromagnetic
resonance are not likely to be of importance in pure metals at room
temperature, or in alloys at any temperature.

In connection with a general discussion of internal
fields in ferromagnetics, Macdonald¹² briefly refers to
his unpublished calculations on exchange effects in
ferromagnetic resonance and states that he agrees with
the conclusion of Kittel and Herring mentioned above.
Since this conclusion was contradicted by the RW
experiments, we undertook the calculations of the


11 C. Kittel and C. Herring, Phys. Rev. 77, 725 (1950).
12 J. R. Macdonald, Proc. Phys. Soc. (London) A64, 968 (1951).

present paper and found that the conclusion of Kittel
and Herring, and of Macdonald, is not justified. After
the oral presentation® of our work, Dr. Macdonald 
kindly lent us his thesis¹³ which contains his calculations,
and pointed out that he had independently arrived at
similar methods and the same new boundary conditions
[our Eqs. (20) and (21)] as we did. We therefore
believe that Macdonald’s conclusion concerning the
inappreciable magnitude of the exchange effect at room
temperature is probably due to the complicated nature
of his implicit final formulas, and to the fact that his
numerical computations were limited to those conditions,
such as the resonance in nickel at ~30000
Mc/sec, where the exchange effect is indeed very small.
Our final result [Eq. (31)], on the other hand, admittedly
lacks generality, but it is an explicit and useful
formula that permits simple predictions within its
range of applicability. Furthermore, the applicability
of our Eq. (31) extends to just those situations in which
physical considerations, discussed by RW, lead one to
expect that the exchange effect is actually appreciable.


APPENDIX A. AN APPROXIMATE METHOD 
OF SOLUTION 
As in Sec. IV, in the paragraph preceding Eq. (31), we again assume
that each of the quantities $\eta$, $\Omega^2$, $L^2$, and $\Omega L$
is negligible compared to unity. However, instead of assuming that
$\epsilon$ is negligible compared to unity, we now make the more
stringent assumption that $\epsilon/|\eta - \Omega^2 + i\Omega L|$ is
negligible compared to unity. With these approximations, the secular
equation (11) becomes

$$K^6 - K^4 + (\eta - \Omega^2 + i\Omega L)K^2 - 2i\epsilon = 0, \tag{A1}$$

and yields the approximate roots

$$K_{1,2}^2 = \tfrac{1}{2}\{(\eta - \Omega^2 + i\Omega L) \pm [(\eta - \Omega^2 + i\Omega L)^2 - 8i\epsilon]^{1/2}\}, \tag{A2a}$$
$$K_3^2 = 1. \tag{A2b}$$

From Eq. (10c), we now obtain (with n = 1, 2, 3)

$$m_{nx} = (K_n^2/8\pi i\epsilon)\,h_{nx}, \tag{A3}$$

and from Eqs. (10b) and (A3)

$$m_{ny} = \frac{(K_n^2 - \eta)K_n^2 + 2i\epsilon}{8\pi i\epsilon(i\Omega + L)}\,h_{nx}. \tag{A4}$$

If Eqs. (A3) and (A4), which correspond to Eqs. (13) and (14), are
substituted into Eq. (10a), the latter yields Eq. (A1) and is therefore
identically satisfied. In a similar way, we obtain an approximate
expression for $e_{nz}$. Next we write down the boundary condition
determinant (24), substitute for the $u_n$ and $v_n$ from Eqs. (A3) and
(A4), add a certain multiple of the fourth row to a certain multiple of
the third row, and obtain

$$\begin{vmatrix} 1 & 1 & 1 & 1 \\ Z' & K_1 & K_2 & K_3 \\ 0 & (K_1^4 + 2i\epsilon)K_1 & (K_2^4 + 2i\epsilon)K_2 & (K_3^4 + 2i\epsilon)K_3 \\ 0 & K_1^3 & K_2^3 & K_3^3 \end{vmatrix} = 0. \tag{A5}$$

13 J. R. Macdonald, Ph.D. Thesis, Oxford, 1950 (unpublished).

Since in our approximation we can write

$$K_1^4 + 2i\epsilon + 1 \approx 1,$$
$$K_2^4 + 2i\epsilon + 1 \approx 1,$$
$$K_3^4 + 2i\epsilon + 1 \approx 2,$$

Eq. (A5) becomes

$$\begin{vmatrix} 1 & 1 & 1 \\ Z' & K_1 & K_2 \\ 0 & K_1^3 & K_2^3 \end{vmatrix} = 0. \tag{A6}$$

If we now disregard the case $K_1 = K_2$, which will be discussed
later, Eq. (A6) yields

$$Z' = \frac{K_1K_2(K_1 + K_2)}{K_1^2 + K_1K_2 + K_2^2}, \tag{A7}$$

which can be combined with Eqs. (30) and (A2a) to give the final
result:

$$\mu_{equ} = \frac{\eta - \Omega^2 + i\Omega L + 2\epsilon^{1/2}(1 + i)}{[\eta - \Omega^2 + i\Omega L + \epsilon^{1/2}(1 + i)]^2}, \tag{A8}$$

in agreement with Eq. (31).
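
The last step of this derivation, from Eqs. (A2a), (A7), and (30) to Eq. (A8), can be verified numerically. The sketch below assumes Eq. (30) in the form $\mu_{equ} = -(i/2)Z'^2/\epsilon$ and takes the roots $K_1$, $K_2$ with non-negative real part; function names and sample parameters are hypothetical. The agreement relies on $K_1K_2 = +(1+i)\epsilon^{1/2}$, which holds for the parameter sets tried here (with $\Omega L \ge 0$) but is not checked in general.

```python
import cmath
import math


def approximate_roots(eta, Omega, L, eps):
    """K_1 and K_2 from Eq. (A2a), taking square roots with Re K >= 0."""
    w = eta - Omega ** 2 + 1j * Omega * L
    s = cmath.sqrt(w * w - 8j * eps)
    return cmath.sqrt(0.5 * (w + s)), cmath.sqrt(0.5 * (w - s))


def mu_from_roots(K1, K2, eps):
    """Combine Eq. (A7) for Z' with Eq. (30), assumed as mu = -(i/2) Z'^2 / eps."""
    z_prime = K1 * K2 * (K1 + K2) / (K1 * K1 + K1 * K2 + K2 * K2)
    return -0.5j * z_prime * z_prime / eps


def mu_closed_form(eta, Omega, L, eps):
    """Eq. (A8), with the exchange parameter entering as eps^(1/2)."""
    w = eta - Omega ** 2 + 1j * Omega * L
    a = (1 + 1j) * math.sqrt(eps)
    return (w + 2 * a) / (w + a) ** 2
```

Calling `mu_from_roots(*approximate_roots(eta, Omega, L, eps), eps)` and `mu_closed_form(eta, Omega, L, eps)` with the same arguments should return the same complex number to rounding error.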
To analyze the limitations of this simple derivation we now distinguish
two cases. (1) If $\lambda = 0$, then $K_1 \neq K_2$, but the quantity
$\epsilon/|\eta - \Omega^2 + i\Omega L|$ becomes very large if $H_z$ is
such that $\eta$ satisfies $\eta \approx \Omega^2$, so that Eq. (A8) is
not valid in the immediate vicinity of this $H_z$. (2) If
$\lambda \neq 0$, with $\lambda$ being very small, then the same
limitation exists as in case (1). But if $\lambda \neq 0$, with
$\lambda$ being arbitrary, then we have the additional limitation that
Eq. (A8) is invalid if $K_1 = K_2$. The latter situation arises if
$\omega$ is such that $\Omega$ satisfies simultaneously the conditions
$\eta - \Omega^2 = 2\epsilon^{1/2}$ and $L\Omega = 2\epsilon^{1/2}$
obtained from Eq. (A2a).

The method given in this Appendix is useful because it leads to
explicit expressions for the three propagation constants and because it
permits a simple derivation of our final result for $\mu_{equ}$. All
the limitations described in the previous paragraph apply, of course,
to the method used to derive Eq. (A8) and not to this equation itself,
since the same result, Eq. (31), was derived without these limitations
by using the bialternant method of Sec. IV.


APPENDIX B. THE DISPERSION LAW FOR 
THERMAL SPIN WAVES 


In considering thermally excited spin waves we can restrict ourselves
to frequencies which are sufficiently high to satisfy the relation

$$\omega \sim k_BT/\hbar, \tag{B1}$$

where $k_B$, T, and $\hbar$ denote Boltzmann’s constant, the absolute
temperature, and (Planck’s constant/2π), respectively. We consider,
moreover, only the type of linear polarization assumed in Sec. II.
Equation (B1) shows that $\Omega$ is now very large and thus permits us
to write the secular equation (11) in the simplified form

$$K^6 - K^4 - \Omega^2K^2 + 2i\epsilon\Omega^2 = 0 \tag{B2}$$

for any typical ferromagnetic metal at all but extremely low
temperatures. The approximate solutions of Eq. (B2) are

$$K_1^2 = 2i\epsilon, \tag{B3}$$
$$K_{2,3}^2 = \pm\Omega. \tag{B4}$$

From the definition of K, Eq. (9e), it is seen that Eq. (B3) gives

$$k = (2i)^{1/2}/\delta, \tag{B5}$$

which is precisely what one obtains from Maxwell’s equations for
permeability unity. Consequently the spin wave corresponding to this k
is “nonmagnetic,” being characterized by $m_x = m_y = 0$, so that it
cannot give rise to a deviation of $M_s$ from $M_0$, the value of $M_s$
at T=0.

Equation (B4), on the other hand, shows that the 
wave corresponding to the plus sign is attenuated while 
that corresponding to the minus sign is not, and that k 
for the latter wave is given by 


w= — (2Ay/M,)F, (B6) 


which agrees with the usual dispersion law" for thermal 
spin waves and thus. leads to the Bloch T!-law for 
(Mo—M,)/Mo. It is rather satisfying that Eq. (B6) 
agrees with the result of the standard treatments 
because the latter ignore the conductivity of the metal, 
and either neglect the magnetic interactions between 
the spins or treat them by magnetostatics rather than 
by Maxwell’s equations. We further note that the 
thermal spin waves, which give rise to the difference 
between $M_0$ and $M_s$, are on the whole much shorter
than the microwaves employed in resonance experi- 
ments. The wavelengths of the latter may be estimated 
from Eq. (A2a), which shows that in most situations, 
including the RW spin wave resonance experiments, the 
wavelengths of the microwaves in the metal will not be 
much smaller than 10- cm. Thus we suggest that at 
temperatures sufficiently far below the Curie point, 
where the spin wave picture is at least approximately 
valid, the concept of a saturation magnetization vector 
may legitimately be used not only in ferromagnetic 
resonance but even in spin wave resonance. 


14 See Herring and Kittel, reference 5.
Data are reported which describe the isothermal time rate of change of electrical resistivity in a specimen
of Cu$_3$Au following a quench from a temperature above the critical temperature (393°C) to one below it.
The experimental arrangements permit observation within one minute after the initiation of the quench. 
At quench temperatures above 364°C the resistivity rises and then decreases, at first rapidly and then more 
slowly. The initial rate of fall increases as the quench temperature is lowered. Below 364°C the early rise is 
absent and the rate of fall decreases with decreasing temperature. The several phenomena are consistent with 
the hypothesis that the processes involved are the formation, by statistical fluctuation, of stable antiphase 
nuclei of order in the disordered matrix, the growth of these nuclei, and the coalescence of the resultant 
antiphase domains. An approximate quantitative theory is developed. Measurements of the isothermal 
time rate of change of resistivity in an annealed specimen following sudden change of temperature below 
the critical temperature are in accord with the kinetic theory recently proposed by Jerome Rothstein. 





INTRODUCTION 


THE crystalline structure of the alloy Cu$_3$Au is
face-centered cubic. The atomic array has long 
range order when the gold atoms preferentially occupy 
a single one of the four constituent simple cubic sub- 
lattices. This order is conveniently measured by a 
parameter S defined by the formula 


$$S = \tfrac{4}{3}\left(r_\beta - \tfrac{1}{4}\right) = 4\left(r_\alpha - \tfrac{3}{4}\right), \tag{1}$$


where $r_\beta$ is the fraction of gold atoms on the chosen
sublattice and $r_\alpha$ the fraction of copper atoms on the
remaining three sublattices. The equilibrium value of S,
corresponding to a minimum in free energy, decreases
from unity at 0°K to 0.84 at about 665°K.¹ At this
critical temperature the values S=0.84 and S=0 are
both stable, above it only S=0. The existence of long
range order is a consequence of the fact that the mutual 
energy of two unlike atoms is less than that of two like 
atoms in the same positions. The following argument 
due to Bethe² accounts for the critical temperature and
indicates the nature of the transition that occurs there. 
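
The definition of the order parameter in Eq. (1) can be checked numerically. The sketch below is illustrative only: the function names are assumptions introduced here, and the relation between the two sublattice fractions assumes the exact Cu3Au composition (one gold atom in four), under which both forms of Eq. (1) should give the same S.

```python
def order_from_gold_fraction(r_beta):
    """S from the fraction of gold atoms on the chosen (gold) sublattice."""
    return (4.0 / 3.0) * (r_beta - 0.25)


def order_from_copper_fraction(r_alpha):
    """S from the fraction of copper atoms on the other three sublattices."""
    return 4.0 * (r_alpha - 0.75)


def gold_fraction_for(r_alpha):
    """Gold-sublattice fraction implied by r_alpha at exact Cu3Au composition.

    Assumption: gold atoms are conserved and make up one quarter of all
    atoms, which gives r_beta = 3 * r_alpha - 2.
    """
    return 3.0 * r_alpha - 2.0


if __name__ == "__main__":
    # Complete disorder (r_alpha = 3/4) through complete order (r_alpha = 1).
    for r_alpha in (0.75, 0.90, 1.00):
        r_beta = gold_fraction_for(r_alpha)
        print(r_alpha,
              order_from_copper_fraction(r_alpha),
              order_from_gold_fraction(r_beta))
```

The two limiting cases, S = 0 for a random arrangement and S = 1 for complete order, follow directly from either form.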

Imagine an ordered crystal divided by a boundary 
surface into two regions, and the order in one of them 
altered by distributing the gold atoms on a different 
sublattice so as to leave the value of S unchanged in 
the region. Then the value of S for the whole crystal is 
decreased, since a smaller fraction of gold atoms now 
occupy a single sublattice; the entropy $\phi$ is increased
by an amount $\Delta\phi$ because the original number of atomic
arrangements is multiplied by the number of ways in
which the boundary can be inserted; and the configurational
energy $E$ is increased by an amount $\Delta E$ due to the
increase in the number of like nearest neighbors at the 


† Publication assisted by the Ernest Kempton Adams Fund for
Physical Research of Columbia University.
‡ Now at the Bell Telephone Laboratories, Murray Hill, New
Jersey.
1 D. T. Keating and B. E. Warren, J. Appl. Phys. 22, 286
(1951).
2 H. A. Bethe, Proc. Roy. Soc. (London) A150, 552 (1935).
F. C. Nix and W. Shockley, Revs. Modern Phys. 10, 1
(1938).


boundary. The new configuration is unstable at a temperature
$T$ provided $\Delta E > T\Delta\phi$. At a higher temperature
the equilibrium order and hence the boundary energy
$\Delta E$ are smaller, while the quantity $T\Delta\phi$ is larger. At the
critical temperature the two terms are equal, and the
appearance of “antiphase” regions of the sort described
produces no increase in free energy. Thus, as in Cu$_3$Au,
two values of S, one zero, may be stable at the critical
temperature. The boundary energy appears as a latent
heat of transition from one to the other. The foregoing
ideas have been developed quantitatively by Bethe,²
Chang,³ Peierls,⁴ and Cowley.⁵


The order at temperatures above the critical temperature
$T_c$ is best described in terms of the atomic
constitution of the several shells of atoms about, e.g.,
a gold atom. The parameters introduced by Cowley⁶
are the $\alpha_i$ defined by the formula

$$\alpha_i = 1 - 4n_i/3c_i,$$

where $n_i$ is the number of copper atoms among the $c_i$
atoms in the ith shell surrounding a gold atom. Cowley
computed the temperature variation of the first ten $\alpha_i$
by analysis of the observed diffuse scattering of x-rays,
and so showed that the average local order approaches
that characteristic of long-range order as the temperature
is lowered toward $T_c$. Fluctuations are superimposed
on the average order, hence it is reasonable to
assume that regions of appreciable order containing
large numbers of atoms have a transient existence at
temperatures not too far above $T_c$.
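
As an illustration of the short-range order parameters defined above, the following sketch evaluates the formula for two shells of a gold atom. The function name is an assumption introduced here. The shell sizes (12 nearest and 6 next-nearest neighbors) are those of any face-centered cubic lattice, and the fully ordered occupancies (all copper in the first shell, all gold in the second) are those of the ideal Cu3Au superlattice.

```python
def cowley_alpha(n_copper, shell_size):
    """Short-range order parameter for one shell around a gold atom.

    n_copper   -- number of copper atoms found in the shell
    shell_size -- total number of atoms in the shell
    """
    return 1.0 - 4.0 * n_copper / (3.0 * shell_size)


if __name__ == "__main__":
    # First shell (12 atoms): random alloy has 3/4 copper, i.e. 9 atoms.
    print("shell 1, random :", cowley_alpha(9, 12))
    # First shell, perfect order: all 12 neighbors of gold are copper.
    print("shell 1, ordered:", cowley_alpha(12, 12))
    # Second shell (6 atoms), perfect order: all 6 are gold, none copper.
    print("shell 2, ordered:", cowley_alpha(0, 6))
```

A random arrangement gives zero for every shell, so nonzero values measure the departure of the local environment from randomness.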

The purpose of this research is to study the isothermal 
processes by which equilibrium long-range order is 
achieved in a specimen which has been quenched from 
a temperature above $T_c$ to one below it and held there.
The extensive observations of Sykes and Evans⁷ supplemented
by the calorimetric measurements of Sykes and


3 T. S. Chang, Proc. Roy. Soc. (London) A161, 546 (1937).
4 R. Peierls, Proc. Roy. Soc. (London) A154, 207 (1936).
5 J. M. Cowley, Phys. Rev. 77, 669 (1950).
6 J. M. Cowley, J. Appl. Phys. 21, 24 (1950).
7 C. Sykes and H. Evans, J. Inst. Metals 58, 255 (1936).
Jones⁸ led these investigators to the conclusion that the
initial processes are the formation and growth of ordered
nuclei in a disordered matrix. The present work confirms
this hypothesis and affords sufficient detail to
encourage rough quantitative analysis in terms of the
nucleation theory of Becker⁹ and Turnbull.¹⁰ It is
further found that the progress toward new equilibrium
order following sudden change in temperature below $T_c$
in a specimen effectively free of antiphase domains is in
excellent agreement with the kinetic theory of ordering
and disordering recently proposed by Rothstein.¹¹ Electrical
resistivity is employed throughout these experiments
as an indicator of the state of affairs in the
material.

EXPERIMENTAL METHOD 


Specimen Material 


The specimen material was prepared by the firm of 
Handy and Harmon, Bridgeport, Connecticut, in the 
form of a wire 25 mils in diameter. The atomic composition
is 24.96±0.05 percent gold and 75.04±0.05 percent
copper, with spectroscopic traces of Fe, Pt, and Ag. The 
wire as received was annealed at 825°C for eight hours. 
The temperature variation of the electrical resistivity 
was then measured in exact accordance with the procedure
described by Sykes and Evans⁷ and Siegel.¹²
The result is in excellent agreement with the observa- 
tions of these investigators. Photomicrographic analysis 
at this stage revealed that the average linear dimensions 
of the constituent microcrystals are 0.12 mm to 0.15 mm 
longitudinally and 0.090 mm transversely. The critical 
temperature lies between 392°C and 393°C. The value 
$T_c$ = 393°C is adopted here.


Temperature Control 


The temperature of a solid can be raised several 
hundred degrees by suspending it in an evacuated trans- 
parent container and irradiating it with the output of 
a few General Electric Company reflector infrared 
heater lamps. This phenomenon is the basis of the 
radiation furnace perfected in this laboratory by Mr. 
Leonard Weisberg and employed in the present research. 

Fig. 1. Cross sectional drawing of the irradiated assembly.

8 C. Sykes and F. W. Jones, Proc. Roy. Soc. (London) A157,
10 D. Turnbull, Am. Inst. Mining Met. Engrs., Tech. Publ. No.
2365 (1948).
11 J. Rothstein, Phys. Rev. 94, 1429(A) (1954).
12 S. Siegel, Phys. Rev. 57, 537 (1940).

Figure 1 is a sectional drawing of the cylindrical
irradiated element. The specimen wire, 10 cm in length,
lies on the axis. It is surrounded coaxially over most of 
its length by two cylindrical copper skirts constructed 
by drilling nearly through two pieces of quarter inch 
copper rod, so as to leave 15 mils wall thickness. The 
solid ends of the skirts are drilled axially to fit the wire, 
and a 10-mil longitudinal slit terminating in a small 
hole is cut in each skirt wall. The potential leads for 
the resistance measurements are pieces of the specimen 
material drawn down to 8 mils diameter. Two sym- 
metrically placed holes of the same diameter are drilled 
6 cm apart through the specimen, and the leads are 
inserted and affixed with a minute application of silver 
solder. 

The skirts are threaded on the specimen so that the 
potential leads pass through the holes in the skirt walls, 
from which they are insulated by bits of ceramic tubing. 
The solid ends of the skirts are then squeezed in hard on 
the specimen with a hydraulic press. The two skirts are 
almost but not quite in contact at the center. Two 
longitudinal slots, 10 mils wide and 18 mils deep, are cut 
in the solid ends of each skirt, and into these the ends of 
No. 30 chromel and alumel thermocouple wires are 
peened. Lastly, the slits in the skirt walls are closed 
with Dupont silver No. 4887. 

The assembly shown in Fig. 1 is mounted axially on 
glass threads in a horizontal glass tube 2 inches in 
diameter and 18 inches long. One end of this tube, 
closed by a brass fitting and O-ring, is connected to the 
pumping system, and through it pass the current, 
potential and thermocouple leads. The other end is 
connected through a manifold of three valves with 
three chambers of 1.4 cubic inches capacity containing 
helium at predetermined pressure for quenching. 

Each skirt is irradiated radially by four 250-watt heat 
lamps arranged quadrantally about the axis. The tem- 
perature of each skirt is stabilized by its own thermo- 
couple, which is connected to a potentiometer. The 
light beam from the potentiometer galvanometer is 
deflected by a right angled prism into one of two photo- 
cells. The amplified photocurrents operate a relay which 
inserts or removes a small resistance in series with the 
variac through which power is supplied the lamps. 
A portion of the light beam is reflected to a visible scale. 
The device stabilizes the temperature of the specimen 
within 0.1°C at 400°C. 


Resistance Measurement 


The resistance of the specimen is measured with a 
Kelvin double bridge of the Wolff type operated with a 
sinusoidally varying emf of 1000 cycles/sec frequency. 
The components of one pair of ratio resistances are 
500 ohms each, while those of the other pair are variable 
and are mechanically coupled so as to remain always 
equal. The latter are four decade resistors whose re- 
sistances per step are 100, 10, 1, and 0.1 ohms respec- 
tively. The bridge detector is a cathode ray oscilloscope 
preceded by a two stage amplifier. The voltage gain of
the amplifier and bridge transformer is 700. The com- 
parison resistance in the bridge is 6 cm of No. 16 
manganin wire located in a thermally insulated box 
outside the specimen chamber. The returning current 
lead from the specimen passes through this box, and is 
there flexible. The emf of mutual induction between 
the current leads in the two bridge arms is annulled by 
manipulation of this flexible portion. The bridge sensi- 
tivity permits detection of a resistance change of 1 part 
in 5000. 
Accuracy 

The chromel-alumel thermocouples are calibrated by 
comparison with a platinum resistance thermometer 
certified by the U. S. Bureau of Standards. The wires 
are cut near the calibrated junctions and the ends are 
peened in the skirts. 

The relation between the temperature of the skirts 
and that of the specimen is established by measurements 
made on an assembly identical with that of Fig. 1 
except that the specimen is replaced by a length of 
No. 18 platinum wire, which is treated as a resistance 
thermometer. The observations reveal that the tem- 
perature of the platinum wire is lower than that of the 
skirts, but that the departure is no more than 1°C at 
400°C. It is assumed that the specimen behaves 
similarly. 

The resistances of the specimen and comparison arms 


of the bridge are of the order 0.02 ohm each, and of the 
bridge yoke 0.011 ohm. The accuracy of the bridge 
ratio resistances specified by the maker is ±0.05 percent.
Accordingly, it is estimated that the accuracy of
the resistance measurements is ±0.2 percent. The
accuracy of the resistivity computed from the resistance 
and the specimen dimensions is about 2 percent. 


Quenching Procedure 


The steps in the quenching procedure, in which two 
persons must participate, are the following: (1) The 
quenching chambers are filled with helium at the de- 
sired pressure. The latter is determined by the magni- 
tude of the temperature change to be effected, and the 
speed of the diffusion pump, a VMF 20-01, and of the 
forepumps, two megavacs in parallel. (2) The specimen 
temperature is stabilized at the desired initial value. 
(3) The lamps are switched off. (4) One or more helium 
manifold valves are opened. (5) The two potentiometer 
settings are altered to the value corresponding to the 
desired lower temperature. (6) When the light beams on 
the visual scales indicate near approach to the new tem- 
perature, power is supplied the lamps and this is ad- 
justed manually until the automatic control can safely 
take over. The pressure in the helium chambers must 
be such that the diffusion pump is operative at this 
time. (7) The decade bridge resistances are progressively 
altered by one observer, and the instant of bridge 
balance is read from a stopwatch by the other. 

Temperature equilibrium is established within one
minute or less after the initiation of the change. Verification
of this is afforded by observation of resistance
changes associated with quenches completed at temperatures
above $T_c$, where no ordering occurs, and also
with quenches made on the platinum wire assembly.

Fig. 2. Isothermal variation of resistance with time following a
quench from 418°C to the indicated temperature.

Fig. 3. Isothermal variation of resistance with time following a
quench from 418°C to the indicated temperature.

Fig. 4. Details of the initial portions of curves similar to those
of Figs. 2 and 3. The origin of each curve is 0.5 a.u. less than that
of the curve above it.

The steps in the procedure for effecting a sudden increase
in temperature below $T_c$ are the same as (2), (3),
(5), and (6) above except that in step (3) maximum 
power is supplied the lamps, and in step (6) the power 
is lowered to a value appropriate to stabilization at the 
new temperature. 

RESULTS 

The curves of Figs. 2, 3, and 4 show the isothermal 
variation of specimen resistance with time following a 
quench from 418°C to the indicated temperature. The 
resistivity $\rho$ of the material is related to the resistance $R$
(or $R_c$), expressed in the arbitrary units (a.u.) here employed,
by the formula $\rho = (0.2255 \times 10^{-6} R_{\mathrm{a.u.}})$ ohm cm.
$R$ denotes the observed resistance at time $t$ after the
initiation of the quench, and $R_c$ the resistance of the
disordered specimen at the temperature of measurement.
The temperature coefficient of resistance of the
disordered material is 0.028 a.u./°C, and the resistance
at 418°C is 63.56 a.u. Accordingly, $R_c = [63.56 - 0.028(418 - T)]$ a.u.,
where $T$ is the temperature in degrees
centigrade.
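
The two conversions just described can be written out as a short sketch. The function and constant names are assumptions introduced here for illustration; the numerical constants are those quoted in the text (0.2255 × 10⁻⁶ ohm cm per a.u., 63.56 a.u. at 418°C, and a temperature coefficient of 0.028 a.u./°C for the disordered material).

```python
AU_TO_OHM_CM = 0.2255e-6      # resistivity per arbitrary unit, from the text
R_DISORDERED_AT_418 = 63.56   # a.u., disordered specimen at 418 deg C
COEFF_DISORDERED = 0.028      # a.u. per deg C, disordered material


def resistivity_ohm_cm(resistance_au):
    """Convert a resistance in arbitrary units to resistivity in ohm cm."""
    return AU_TO_OHM_CM * resistance_au


def disordered_resistance_au(temp_celsius):
    """Resistance (a.u.) the disordered specimen would have at temp_celsius,
    obtained by linear extrapolation from the value at 418 deg C."""
    return R_DISORDERED_AT_418 - COEFF_DISORDERED * (418.0 - temp_celsius)


if __name__ == "__main__":
    for temp in (418.0, 390.8, 364.3, 286.0):
        r_ref = disordered_resistance_au(temp)
        print(temp, round(r_ref, 3), resistivity_ohm_cm(r_ref))
```

Subtracting the extrapolated value from an observed resistance gives the quantity plotted as the ordinate of the quench curves.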

The curve of Fig. 5 shows the variation of resistance 
with temperature immediately above $T_c$. The solid
circles indicate the relative positions of the maxima of 
the curves of Fig. 4. 

The curves of Figs. 6 and 7 represent observations 
made upon a specimen after annealing at 388°C for 50
hours. The constituent microcrystals are then essentially
single domains, as is indicated by the close
proximity of the resistance to the equilibrium value for
the temperature. Accordingly, these data describe the
true kinetics of the ordering and disordering processes
below $T_c$, free from phenomena associated with antiphase
domains.

The curves of Fig. 6 show the isothermal variation of 
resistance with time following a quench from 388°C to 
the indicated temperature. Here $R$ denotes, as before,
the observed resistance at time $t$ after the initiation of
the quench, and $R_c$ the resistance of the specimen at the
temperature of measurement and in the state of order
associated with 388°C. Sykes and Evans⁷ have shown
that the temperature coefficient of resistivity of the
material near equilibrium order is very nearly independent
of order. The value is 0.051 a.u./°C. Accordingly,
$R_c = [R_i - 0.051(388 - T)]$ a.u., where $R_i$ is the
resistance at 388°C.

The curves of Fig. 7 show the isothermal varia- 
tion of resistance with time after the temperature is 
suddenly raised from 338°C to the indicated value. 
Here $R_c = [R_i + 0.051(T - 338)]$ a.u., where $R_i$ is the
resistance at 338°C. The values of $R_i$ at 388°C and
338°C decreased about 2 parts in 500 during the several 
days of observation following the 50-hour anneal. The 
initial values were 50.37 a.u. at 388°C and 39.70 a.u. 
at 338°C. 

DISCUSSION 


Ordering Processes 


A specimen disordered at a temperature above $T_c$ and
quenched to a temperature below $T_c$ is in a state of
metastable equilibrium. Local fluctuations in free 
energy produce small ordered regions called “embryos” 
throughout the material. In virtue of the increase in 
surface energy associated with the formation of an 
embryo, most embryos are unstable and vanish; how- 
ever, embryos of greater than a certain critical size are 
stable and grow, with accompanying decrease in the 
free energy of the system composed of the ordered 
embryo and the disordered matrix. It is likely that 
neighboring stable embryos are antiphase, and growth 
persists with concurrent increase in the order within 
the embryo toward the equilibrium value for the tem- 
perature, until the boundaries of adjacent antiphase 
regions coincide. Stable embryos thus constitute “nuclei” 
for the establishment of equilibrium order within nu- 
merous contiguous antiphase domains in the medium. 

The occurrence of an embryo of given size in a disordered
material is the result of many atomic interchanges,
and the chance that these occur simultaneously
is slight. Smaller embryos must appear before larger
ones.¹³ Embryos are present at the instant the quench
temperature is reached, but if these are appreciably
smaller than the critical size for that temperature a
finite “incubation period” must precede the formation
of a nucleus.

13 The mechanism of the formation of embryos in the cognate
phenomenon of phase precipitation in solid solutions has been
described in detail by Turnbull, reference 10.

Fig. 5. The variation of resistance with temperature immediately
above $T_c$. The solid circles indicate the relative positions of
the maxima of the curves of Fig. 4.

Accordingly, the first ordering process is the forma- 
tion of nuclei and the second their growth into con- 
tiguous antiphase domains of equilibrium order. The 
third process is the coalescence of these domains until 
a highly stable structure results which has been compared
by Bragg¹⁴ to that of a stable foam. In this
structure the domains are nearly equal sized regular 
polyhedra with approximately plane bounding surfaces, 
and there is no tendency for one domain to absorb 
another. Furthermore, as remarked by Sykes and Jones, 
as the nuclei grow they absorb gold and copper atoms 
in the ratio 1:3, and any excess of atoms of either kind 
consequent upon initial fluctuations in atomic composition
are concentrated at the domain boundaries. Additional
coalescence involves movement of this material
along the boundaries. Thus the fourth and final process
in the establishment of equilibrium order throughout
the specimen is extremely slow even at temperatures
near the critical temperature. Owen and Sim¹⁵ observed
the time variation of domain size in a quenched specimen
during annealing at 350°C. The linear dimensions
of the domains increased to 71 interatomic distances in
30 hours, and showed no further change after 87 hours.

Fig. 6. Isothermal time variation of resistance of an annealed
specimen following a quench from 388°C to the indicated temperature.
(Ordinate: $R - R_c$ in arbitrary units; abscissa: time in minutes.)

14 W. L. Bragg, Proc. Phys. Soc. (London) 52, 105 (1940).

Fig. 7. Isothermal time variation of resistance of an annealed
specimen following a sudden increase in temperature from 338°C
to the indicated temperature.


Theory of Order 


The basis here adopted for a quantitative analysis of 
the nucleation process is the theory of order proposed 
by Cowley.⁵ In this theory the configurational energy
is assumed to be the sum of energies of interaction
between pairs of atoms, one of which is in the ith shell
of atoms surrounding the other. These energies are
denoted by $V_{AA,i}$, $V_{BB,i}$, $V_{AB,i}$ according as the atoms
of the pair are both copper, both gold, or one of each.
The configurational energy $E$ of an ordered lattice of $N$
atoms, referred to a state of complete disorder, is given
by the expression

$$E = -\tfrac{3}{4} S^2 N W, \tag{2}$$

where

$$W = V_1 - \tfrac{3}{2}V_2 + 2V_3 - \cdots, \tag{3}$$

and

$$V_i = \tfrac{1}{2}(V_{AA,i} + V_{BB,i}) - V_{AB,i}. \tag{4}$$


This form is valid only for values of S different from
zero. The equilibrium long-range order is related to the
temperature below $T_c$ by the equation

$$\ln\left[\frac{(1+3S)(3+S)}{3(1-S)^2}\right] = \frac{16}{3}\,\frac{T_c}{T}\,S, \tag{5}$$

where

$$T_c = (3/2k)W. \tag{6}$$

15 E. A. Owen and M. Sim, Phil. Mag. 38, 342 (1947).


The Helmholtz free energy per atom $f$, referred to
complete disorder, is given by the expression¹⁶

$$f = \tfrac{1}{16}kT\left[(1+3S)\ln(1+3S) + 6(1-S)\ln(1-S) + 3(3+S)\ln\left(1+\tfrac{1}{3}S\right)\right] - \tfrac{3}{4}S^2 W. \tag{7}$$
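
The connection between the free energy of Eq. (7) and the equilibrium condition of Eq. (5) can be checked numerically. The sketch below is written under stated assumptions: Eq. (7) is read as the ideal-mixing entropy term minus (3/4)S²W, Eq. (5) as ln[(1+3S)(3+S)/3(1−S)²] = (16/3)(T_c/T)S, and W/k is set to 444 K (the value that Eq. (6) gives for the observed critical temperature). The function names are introduced here for illustration. Under this reading, df/dS equals (3kT/16) times the difference of the two sides of Eq. (5), so Eq. (5) is the condition for a stationary free energy.

```python
import math


def free_energy_over_k(S, T, W_over_k=444.0):
    """f/k in kelvin per atom, following the reading of Eq. (7) stated above.
    Valid for 0 <= S < 1."""
    bracket = ((1 + 3 * S) * math.log(1 + 3 * S)
               + 6 * (1 - S) * math.log(1 - S)
               + 3 * (3 + S) * math.log(1 + S / 3))
    return T * bracket / 16.0 - 0.75 * S ** 2 * W_over_k


def eq5_residual(S, T, W_over_k=444.0):
    """Left side minus right side of Eq. (5), with T_c = 3W/2k from Eq. (6)."""
    T_c = 1.5 * W_over_k
    lhs = math.log((1 + 3 * S) * (3 + S) / (3 * (1 - S) ** 2))
    return lhs - (16.0 / 3.0) * (T_c / T) * S


def slope_over_k(S, T, h=1e-6):
    """Central-difference estimate of d(f/k)/dS."""
    return (free_energy_over_k(S + h, T) - free_energy_over_k(S - h, T)) / (2 * h)


if __name__ == "__main__":
    T = 600.0
    for S in (0.2, 0.5, 0.8):
        # The two columns should agree to within the finite-difference error.
        print(S, slope_over_k(S, T), 3 * T / 16 * eq5_residual(S, T))
```

The agreement of the two printed columns is a consistency check on the reading of the equations, not an independent confirmation of the numerical coefficients.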


Nucleation Process¹⁷


The parameter S is here employed as a measure of the
order in nuclei. This is a valid first approximation when
the nuclear size is large, of the order of 10 interatomic
distances or more.

The change in free energy $\Delta F$ associated with the
formation of an embryo in a completely disordered
matrix can be represented by the expression

$$\Delta F = \Delta F_v + \Delta F_s + \Delta F_e, \tag{8}$$

where $\Delta F_v$ is the change in free energy due to ordering
within the embryo as given by Eq. (7), $\Delta F_s$ is the work
done in forming the interface between ordered embryo
and disordered matrix, and $\Delta F_e$ is the elastic strain
energy associated with the contraction of the material
on ordering.¹⁸ $\Delta F_v$ is intrinsically negative, but $\Delta F_s$ and
$\Delta F_e$ are positive. $\Delta F_e$ is a minimum when the bounding
surfaces of the embryo are the (111) slip planes of the
face-centered cubic lattice, so that the resistance to
shear is a minimum, and when the shape of the embryo
is a disk.¹⁹ It is assumed that the embryos conform to
this pattern, and the term $\Delta F_e$ is neglected.

The method for evaluating $\Delta F_s$ is as follows. Imagine
a volume of material in order S divided into two
regions a and b by a plane surface of area $\sigma$, and the
same volume of disordered material divided into regions
c and d by the same area. This is the first configuration.
Now let region c be joined to region a on the area $\sigma$,
and d joined to b similarly. This is the second configuration.
The surface energy associated with an area
$2\sigma$ separating disordered material from material of order
S is the energy of the second configuration minus that
of the first.

The surface energy is assumed to be that contributed by
nearest neighbors alone, and is computed for each of 
the four bounding surfaces by simply counting the 
average number of nearest neighbor bonds of the types 
AA, BB, and AB acting across them. For example, the 
average number of AB-type bonds acting across the 
ordered-ordered surface is computed as follows. Figure 8 
is a diagram of the atomic arrangement on two adjacent 
(111) planes in the crystal. Here the copper sites are 

16 The entropy term is computed by R. H. Fowler and E. A. 
Guggenheim, Statistical Thermodynamics (Cambridge University 
Press, London, 1939), p. 600. 

17 The method adopted follows closely that employed by
R. Becker in his study of phase precipitation in solid solutions,
reference 9.

18 E. A. Owen and Y. H. Liu, Phil. Mag. 38, 354 (1947); also
F. C. Nix and D. MacNair, Phys. Rev. 60, 320 (1941).

19 F. R. N. Nabarro, Proc. Phys. Soc. (London) 52, 90 (1940).
labelled $\alpha$ and the gold sites $\beta$, and the primed quantities
refer to plane I and the unprimed to plane II. It will
be noted that the nearest neighbors of the $\alpha'$ sites are
two $\alpha$ and one $\beta$, while the nearest neighbors of the $\beta'$
sites are three $\alpha$. Now in accordance with Eqs. (1), the
numbers of $\alpha'$ sites occupied by A and B atoms respectively
are $\tfrac{3}{16}(3+S)N_\sigma$ and $\tfrac{3}{16}(1-S)N_\sigma$, where $N_\sigma$ is the
total number of atoms on the area $\sigma$. Similarly the
numbers of $\beta'$ sites occupied by A and B atoms respectively
are $\tfrac{3}{16}(1-S)N_\sigma$ and $\tfrac{1}{16}(1+3S)N_\sigma$. Hence the
average number of AB-type bonds contributed by the
$\alpha'$ sites is

$$\tfrac{3}{16}\left\{(3+S)\left[r_\beta + 2(1-r_\alpha)\right] + (1-S)\left[(1-r_\beta) + 2r_\alpha\right]\right\} N_\sigma.$$

Similarly, the average number of AB-type bonds contributed
by the $\beta'$ sites is

$$\left\{\tfrac{1}{16}(1+3S)\,3r_\alpha + \tfrac{3}{16}(1-S)\,3(1-r_\alpha)\right\} N_\sigma.$$


A calculation of this sort yields the numbers of bonds 
of each type acting across each of the four surfaces 
(the two order-disorder surfaces are identical), and so 
the expression for $\Delta F_s$. The result is

$$\Delta F_s = \tfrac{3}{16} S^2 V_1 N_\sigma, \tag{9}$$

where $V_1$ is given by Eq. (4).

The shape of an embryo is assumed, for simplicity,
to be a square disk of edge $L$ interatomic distances and
edge to thickness ratio five. Equations (8) and (9)
yield accordingly the expression

$$\Delta F = \tfrac{1}{5} f L^3 + \tfrac{21}{40} S^2 V_1 L^2, \tag{10}$$

in which $f$ is given by Eq. (7). Cowley’s measurements
of x-ray scattering gave $V_1 = 358k$ and $W = 371k$. The
value of $W$ computed with Eq. (6) and the observed
critical temperature is $444k$. The value $V_1 = 400k$ is
adopted here. In the foregoing expressions for $V_1$ and $W$
the quantity $k$ is to be regarded as a dimensionless
constant whose value is $1.380 \times 10^{-16}$. The energies and
free energies computed with the formulas here given are
then evaluated in ergs.

The variation of $\Delta F$ with $L$ at various temperatures
is depicted in Fig. 9, where $L^*$ denotes the nuclear size
and $\Delta F^*$ the fluctuation in free energy required to
produce a nucleus in a disordered matrix. The variation
of $L^*$, plotted as ordinate, with S between S=0 and
S=1 is a U-shaped curve at all temperatures, with a
minimum at about S=0.4. Hence this value is used in
computing the curves. Relaxation of the constituent
material toward equilibrium order is concurrent with
nuclear growth.

Fig. 8. The arrangement of copper $\alpha$ sites and gold $\beta$ sites on
two adjacent (111) planes of the lattice. The planes are distinguished
by primed and unprimed symbols.

Fig. 9. Variation of embryo free energy with size and temperature.

It will be noted that as the quench temperature is
lowered from $T_c$ the number density of nuclei, which is
proportional to $\exp(-\Delta F^*/kT)$, increases rapidly, while
the quantities $\Delta F^*$ and $L^*$ both decrease, and with them
the incubation period for the formation of a nucleus.
Furthermore, since the density increases far more
rapidly than the size diminishes, a temperature is
implied at which the nuclear boundaries are contiguous
on formation.
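
The critical size and barrier implied by Eq. (10) can be sketched numerically. The code below is a sketch under stated assumptions, not a reconstruction of the curves of Fig. 9. It reads Eq. (10) as ΔF = (1/5)fL³ + (21/40)S²V₁L² and Eq. (7) as the ideal-mixing entropy term minus (3/4)S²W, takes W/k = 444 K in the free energy (the text does not say explicitly which value of W was used there), and uses V₁/k = 400 K and S = 0.4 as quoted. Setting dΔF/dL = 0 gives L* = −(7/4)S²V₁/f and ΔF* = (7/40)S²V₁L*². All names are illustrative.

```python
import math

W_OVER_K = 444.0    # kelvin; assumed value of W/k used in the free energy
V1_OVER_K = 400.0   # kelvin; V_1/k adopted in the text
S_NUCLEUS = 0.4     # order assumed inside the nucleus, as in the text


def f_over_k(S, T):
    """Free energy per atom divided by k (kelvin), Eq. (7) as read above."""
    bracket = ((1 + 3 * S) * math.log(1 + 3 * S)
               + 6 * (1 - S) * math.log(1 - S)
               + 3 * (3 + S) * math.log(1 + S / 3))
    return T * bracket / 16.0 - 0.75 * S ** 2 * W_OVER_K


def embryo_energy_over_k(L, S, T):
    """Delta F / k (kelvin) for a square-disk embryo of edge L, Eq. (10)."""
    return 0.2 * f_over_k(S, T) * L ** 3 + (21.0 / 40.0) * S ** 2 * V1_OVER_K * L ** 2


def critical_nucleus(S, T):
    """Return (L_star, Delta F_star / k) from the stationary point of Eq. (10)."""
    f = f_over_k(S, T)
    if f >= 0.0:
        raise ValueError("ordering does not lower the free energy; no nucleus")
    L_star = -(7.0 / 4.0) * S ** 2 * V1_OVER_K / f
    return L_star, embryo_energy_over_k(L_star, S, T)


if __name__ == "__main__":
    for t_celsius in (390.0, 380.0, 364.0, 300.0):
        T = t_celsius + 273.15
        L_star, barrier = critical_nucleus(S_NUCLEUS, T)
        # barrier / T is Delta F* / kT, the exponent of the Boltzmann factor.
        print(t_celsius, L_star, barrier / T, math.exp(-barrier / T))
```

Within this sketch both L* and ΔF*/kT fall as the temperature is lowered, so the Boltzmann factor rises, which is the qualitative trend described in the paragraph above.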

The nuclear dimensions indicated in Fig. 9 are con- 
sistent with the antiphase domain sizes inferred by 
Sykes and Jones from the width and intensity of the 
superlattice lines formed by x-ray diffraction. On the 
present view, the nucleus must be smaller than the 
domain into which it grows. The average domain size is 
75 interatomic distances in a specimen cooled from
above $T_c$ at 30°C per hour, and 6 to 8 interatomic
distances in a specimen water quenched to room tem- 
perature and heated to 130°C, at which temperature 
equilibrium order within the domains is established. 


Electrical Resistivity 


The isothermal resistivity of a crystal ordered on a 
single sublattice decreases with increasing order. How- 
ever, as remarked by Sykes and Jones, the presence of 
antiphase domains in the crystal destroys its homo- 
geneity and so must disturb this behavior when the 
size of the domains is comparable with the mean free 
path of the conduction electrons. Electron reflections
at the domain boundaries then increase the resistivity
above the value corresponding to the order within. The
magnitude of the change increases with diminishing
domain size; when the domains are sufficiently small
the resistivity is that of a completely disordered material.
The mean free path $\lambda$ of the conduction electrons
in monovalent metals is given by the expression
$\lambda = (h/\rho e^2)(3/\pi N^2)^{1/3}$, where $\rho$ is the resistivity, $e$ the
electronic charge, and $N$ the atomic density.²⁰ The
resistivity is $10^4$ emu for the ordered and $1.4 \times 10^4$ emu
for the disordered material. The density is 12 g/cm³,
and the lattice constant is $3.7 \times 10^{-8}$ cm. The corresponding
values of $\lambda$ are 38 and 27 interatomic distances.
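
The mean free path estimate can be reproduced approximately with a few lines of arithmetic. This is a sketch under assumptions that should be checked against the cited reference: the formula is read as λ = (h/ρe²)(3/πN²)^(1/3) in electromagnetic cgs units, the atomic density is taken as four atoms per cubic cell of edge 3.7 × 10⁻⁸ cm rather than from the quoted mass density, and the result is expressed in units of that cell edge. The names are introduced here for illustration.

```python
import math

H_PLANCK = 6.626e-27        # erg s
E_CHARGE_EMU = 1.602e-20    # electronic charge in electromagnetic units
LATTICE_CONSTANT = 3.7e-8   # cm, as quoted in the text


def mean_free_path_cm(rho_emu, atoms_per_cm3):
    """Mean free path from the formula as read above (emu resistivity)."""
    return (H_PLANCK / (rho_emu * E_CHARGE_EMU ** 2)) * \
        (3.0 / (math.pi * atoms_per_cm3 ** 2)) ** (1.0 / 3.0)


# Assumption: face-centered cubic cell containing four atoms.
ATOMIC_DENSITY = 4.0 / LATTICE_CONSTANT ** 3

ordered = mean_free_path_cm(1.0e4, ATOMIC_DENSITY) / LATTICE_CONSTANT
disordered = mean_free_path_cm(1.4e4, ATOMIC_DENSITY) / LATTICE_CONSTANT

if __name__ == "__main__":
    print("ordered   :", ordered)
    print("disordered:", disordered)
```

With these inputs the two values come out within a few percent of the quoted 38 and 27. The ratio of the two is fixed by the ratio of the resistivities alone.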

Small embryos initially present in a specimen 
quenched to, say, 380°C act as scattering centers for 
the electrons. Nuclei of high order, as they appear, 
enhance the inhomogeneity with associated increase in 
resistivity. Subsequent nuclear growth decreases the 
resistivity by removal of disordered material, and 
domain coalescence tends still further to restore homo- 
geneity to the medium. 

The variation of the resistivity with temperature
above $T_c$ is linear except in the neighborhood of $T_c$,
where the variation is as shown in Fig. 5. The increase
of resistivity above the linearly extrapolated value
indicated by the dashed line is produced by embryos
whose size and number density increase as the temperature
is lowered toward $T_c$. If $\alpha_i$ is a measure of the local
order, the mean square fluctuation in $\alpha_i$ is given by
the expression²¹

$$\langle(\Delta\alpha_i)^2\rangle_{\mathrm{av}} = \frac{kT}{n\,\partial^2 f(\alpha_i)/\partial\alpha_i^2},$$

where $f(\alpha_i)$ is the free energy per atom and $n$ is the
number of atoms in the embryo. The free energy is
independent of local order at the critical temperature.
The phenomenon is similar to critical opalescence in
liquids.

Ordering in Quenched Specimens 


The curves of Figs. 2, 3, and 4 will now be discussed 
in detail. The incubation period is identified approxi- 
mately with the elapsed time after quenching before 
the maximum of resistivity is reached. Values of these 
maxima are plotted as a function of the quench temperature
with the solid circles of Fig. 5.

T=390.8°C. The nuclei are large and their number 
density small. Hence the incubation period is long and 
the increase of resistivity slight. The entire curve repre- 
sents the formation and growth of nuclei. Contiguous 
domains have not been established 130 minutes after 
quenching. 


20 N. F. Mott and H. Jones, The Theory of the Properties of
Metals and Alloys (Oxford University Press, London, 1936), p. 268.
21 A. H. Cottrell, Theoretical Structural Metallurgy (Edward
Arnold and Company, London, 1951), p. 232.

T=389.6°C to 364.3°C. The nuclei are progressively
smaller and their number density larger as the tempera- 
ture is lowered. The incubation period is not revealed 
by the measurements below about 360°C, where an 
appreciable number of embryos initially present are of 
nuclear size. The resistivity rise at first increases with 
the increase in the number of scattering centers (nuclei), 
and then decreases as the effect of scattering is annulled 
by the growth of nuclei present at the instant of quench. 

The rate of removal of disordered material, and there- 
fore the rate of fall of resistivity, increase with the 
number of growing nuclei. This number reaches a 
maximum at 364°C. Nuclear growth is stopped when 
the nuclear boundaries touch, and subsequent reduction 
of resistivity follows the removal of disordered material 
by the slower process of domain coalescence. 

T=364.3°C to 286°C. As the temperature is lowered 
from 364°C and the density of nuclei increases corre- 
spondingly, a greater proportion of nuclei are con- 
tiguous on formation and a lesser proportion are free 
to grow. Hence the rate of fall of resistivity decreases 
with decreasing temperature. It appears that in the 
neighborhood of 300°C all the nuclei are effectively 
contiguous on formation. The data at 286°C then 
represent entirely the process of domain coalescence. 

Referring to Fig. 4, the initial small rate of change of 
resistivity with time below 360°C represents a balance
between the effect of nuclear growth, which is to de- 
crease the resistivity, and that of the formation of new 
nuclei, which is to increase it by scattering. 


Ordering in a Single Domain 


Figures 6 and 7 show the relaxation of order from one 
equilibrium value to another following a sudden change 
in temperature in a specimen effectively free of anti- 
phase domains. These curves represent the true kinetics 
of ordering and disordering per se. In accordance with 
Rothstein’s theory,²² the time variation of the isothermal
resistivity $R_t$ is given by the formula

$$(R_t - R_0)/(R_e - R_0) = \coth(\gamma t + \delta), \qquad (11)$$

or the formula

$$(R_t - R_0)/(R_e - R_0) = \tanh(\gamma t + \delta), \qquad (12)$$

according as the initial temperature is greater than or
less than the final constant temperature. Here $R_e$ is the
equilibrium resistivity and $R_0$ the resistivity for perfect
order, both corresponding to the temperature of measurement;
$t$ is the time after quenching; $\gamma$ and $\delta$ are
temperature-dependent constants. Values of $R_e$ are
obtained from the curve relating equilibrium resistivity
and temperature, and $R_0$ from a linear extrapolation of
this curve upward from low temperatures, where the
order parameter $S$ is nearly unity. Values of $\gamma$ and $\delta$ are
determined by the data. The plotted points of Figs. 6
and 7 represent observations, and the curves are graphs
of Eqs. (11) and (12).

22 An abstract of Rothstein’s theory appears in reference 11.
A complete presentation, including further discussion of the data
of Figs. 6 and 7, is in preparation.

It will be noted that the data are consistent with the 
assumption, introduced earlier, that the processes of 
nuclear growth and ordering within nuclei are simultaneous
in specimens quenched from above $T_c$.
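
As an illustration of how Eqs. (11) and (12) may be used, the sketch below evaluates the isothermal resistivity for given constants and inverts the relation at two observation times to recover the two constants. The numerical values in the usage note are hypothetical and are not taken from Figs. 6 and 7. The function names and the `cooling` flag (true when the initial temperature exceeds the final one, so that the coth form applies) are introduced here for illustration only.

```python
import math

def resistivity(t, r0, re, gamma, delta, cooling):
    """R_t from Eq. (11) (cooling=True, coth) or Eq. (12) (cooling=False, tanh)."""
    x = gamma * t + delta
    factor = 1.0 / math.tanh(x) if cooling else math.tanh(x)
    return r0 + (re - r0) * factor

def gamma_delta_from_two_points(t1, r1, t2, r2, r0, re, cooling):
    """Invert Eq. (11) or (12) at two times to recover gamma and delta."""
    def argument(r):
        y = (r - r0) / (re - r0)
        # arccoth(y) = arctanh(1/y) for |y| > 1
        return math.atanh(1.0 / y) if cooling else math.atanh(y)
    x1, x2 = argument(r1), argument(r2)
    gamma = (x2 - x1) / (t2 - t1)
    return gamma, x1 - gamma * t1
```

For example, with the hypothetical values `r0 = 1.0`, `re = 2.0`, `gamma = 0.05` per minute, and `delta = 0.3`, the coth form starts above `re` and decays toward it, while the tanh form starts below `re` and rises toward it. Two readings are the minimum needed; with real data one would fit all the points.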

In conclusion the authors acknowledge with gratitude
their indebtedness to Dr. Charles D. Coxe and his 
associates of Handy and Harmon, who prepared the 
specimen material; to Professor Victor K. La Mer of 
Columbia University for helpful advice on nucleation 
theory; and to Mr. Leonard Weisberg of this laboratory,
who devised the radiation furnace and assisted in many 
of the measurements. 

Electrical Properties of Gallium Antimonide*


D. P. Detwiler†
The Franklin Institute Laboratories for Research and Development, Philadelphia, Pennsylvania
(Received December 9, 1954)


Data are presented on the conductivity and Hall coefficient of several samples of GaSb over the temperature
range from −196°C to 650°C. The lowest room-temperature conductivity obtained was 12 ohm⁻¹ cm⁻¹.
All material produced from zone-purified components was p-type. N-type material was produced by doping
with tellurium, as were p-n junctions. The intrinsic band gap is estimated from junction rectification data
to be 0.78 ev at −196°C. The mobility of electrons was found by measurement on n-type material to be
several times greater than the hole mobility. The mobilities of both holes and electrons are found to vary
approximately as $T^{-3/2}$ in the lattice scattering range.


INTRODUCTION 


CONSIDERABLE interest in the semiconducting
properties of intermetallic compounds, particularly
those formed by the combination of a group three and
a group five element, has developed during the last
several years.¹⁻³ Leifer and Dunlap⁴ have recently
published the results of their studies of the properties
of a relatively pure sample of p-type GaSb. The
present work includes several p-type samples, the purest
of which is comparable to that of Leifer and Dunlap,
as well as an n-type sample. Hall effect and conductivity
measurements were made over the temperature range
from −196°C to 650°C. In addition, the current-voltage
characteristics of a grown p-n junction were measured
at room temperature and at −196°C.


PREPARATION OF MATERIALS 


GaSb was formed by the direct combination of the
zone-refined components⁵,⁶ in approximately stoichiometric
proportions. It was found that this could be done
most conveniently by mixing the purified components
in the zone-refining boat and permitting the reaction
to occur as the first molten zone was passed through
the charge. Any excess of either component is quickly
transported to the end of the ingot. Since gallium
exhibits some tendency to wet the silica boats employed,
a slight excess of antimony was usually added to
insure complete reaction of the gallium.


* This research was supported by the U. S. Air Force, through
the Office of Scientific Research of the Air Research and Development
Command.
† The Franklin Institute Laboratories for Research and
Development; now at the New York State College of Ceramics,
Alfred, New York.
1 H. Welker, Z. Naturforsch. 7a, 744 (1952); 8a, 248 (1953).
2 R. G. Breckenridge, Phys. Rev. 90, 488 (1953).
3 M. Tanenbaum and J. P. Maita, Phys. Rev. 91, 1009 (1953).
4 H. N. Leifer and W. C. Dunlap, Jr., Phys. Rev. 95, 51 (1954).
5 Tanenbaum, Goss, and Pfann, J. Metals 6, 762 (1954).
6 D. P. Detwiler and W. M. Fox, J. Metals 7, 205 (1955).

Zone-refining of antimony and GaSb and subsequent 
GaSb crystal growing by the Czochralski method were
carried out under a purified hydrogen atmosphere. 
Electrolytic hydrogen, freed of oxygen and water by 
passing successively through a catalytic purifier, a 
high-voltage discharge, a CaSO₄ drying tower, and a
liquid nitrogen cooled trap, was passed continuously 
over the charge. The resulting ingots of both antimony 
and GaSb exhibited a very clean, mirror-like surface as 
compared to the dull matte surface obtained with less 
pure hydrogen. 

All of the material produced in this fashion exhibited 
p-type conductivity, the value depending upon the pur- 
ity of the gallium employed. The less pure GaSb samples 
were prepared from relatively impure gallium which 
had not been zone-refined. These facts, together with 
the observation that the purity of GaSb made from 
the same gallium after zone-refining was as high as 
any produced, indicate the presence in at least some
gallium of an impurity which is not effectively removed 
from the compound by the zone-refining process. 

N-type material and p-n junctions were prepared by 
doping with an alloy of about one atomic percent 
tellurium in GaSb when growing crystals. 

Fig. 1. The conductivity of gallium antimonide as a function of reciprocal temperature.


EXPERIMENTAL PROCEDURE 


The conductivity and Hall coefficient of samples with
dimensions of approximately 1.0 cm by 0.1 cm by
0.3 cm were measured over the temperature range from
−196°C to 650°C in a vacuum furnace. The samples
were supported on a lavite sample-holder mounted 
inside a mica-lined heavy-wall copper cavity. Mica 
lining was employed to avoid evaporation of copper 
onto the specimens at the higher temperatures. A 
vacuum of about 10-> mm of Hg was maintained 
during measurements. 

The temperature was determined with a calibrated 
chromel-alumel thermocouple mounted in the sample 
chamber. A second thermocouple in good thermal 
contact with the heater windings was employed in 
conjunction with a photoelectrically controlled thyra- 
tron for temperature control. With this circuit the 
temperature of the sample could be maintained constant 
to ±0.02°C over long periods. Temperatures below
room temperature were obtained by immersing the 
entire vacuum furnace in liquid nitrogen and supplying 
sufficient heat to reach the desired temperature. 

Potential contacts to the specimen for conductivity 
and Hall coefficient measurements were made with 
stainless steel whiskers. After being placed in pressure
contact with the sample, the probes were lightly
welded by a high-voltage discharge from a Tesla coil.
This resulted in a quite stable, low-resistance, nonrectifying
contact of very small area. Remeasurement of
a sample at low temperature after heating to the highest
temperature indicated the absence of any effect of the 
contacts upon the bulk properties. Current contacts 
to the specimens were large-area stainless steel springs. 
All potential measurements were made with a Wenner 
thermo-free potentiometer. 

The magnetic field employed in determining the Hall 
constant was produced by an electromagnet. The 
field was controlled by an electronic controller to an 
accuracy of ±0.02 percent, and could be varied from
zero to 5000 oersteds. No dependence of Hall constant 
upon field strength was found in the purest specimens 
for fields from 200 to 2500 oersteds. 

Samples were cut from grown crystals with either a 
diamond or a carborundum saw and were etched with 
a dilute HCl-HNO₃ etch. A more concentrated solution
of the same acids acts as a chemical polish. It was found
possible to achieve a concentration such that a p-n
junction could be located visually because of the differ- 
ence of polishing activity between n-type and p-type 
materials. 

Fig. 2. The Hall coefficient of gallium antimonide as a function of reciprocal temperature.


For measurement of p-n junction characteristics,
contacts were soldered to samples with pure indium
solder, using a zinc chloride flux. This results in mechanically
reliable, nonrectifying contacts.


RESULTS AND DISCUSSION


A. Conductivity and Hall Coefficient


Conductivity and Hall coefficient data on several
samples, both n-type and p-type, are shown in Figs.
1 and 2, respectively. All samples except that designated
A-1 were single crystals; all except B-2n were p-type.
Samples A-1 and B-2p are shown for comparison
with the results of Leifer and Dunlap.⁴ The purity is
very similar to theirs, illustrating the observation that
various workers have obtained essentially the same
limiting purity.

An impurity activation energy of 0.025 ev is calculated
from the Hall coefficient vs temperature of sample
B-2p, in good agreement with the previous results.⁴
The Hall coefficients of samples A-2 and B-2n, containing
respectively 1.2×10¹⁸ and 7×10¹⁷ impurities per
cm³, however, indicate that at this concentration the
impurities are completely ionized at temperatures as
low as −196°C. This implies an activation energy not
greater than approximately 0.01 ev at this concentration.

Fig. 3. The mobility of charge carriers in gallium antimonide. [Ordinate: mobility (cm²/volt/sec).]

Figure 3 shows the mobility values computed from
the measured Hall coefficients and conductivities.
Sample B-2p exhibits the expected $T^{-3/2}$ dependence

Fig. 4. A p-n junction rectification characteristic
at room temperature and −196°C.


of mobility upon temperature in the higher temperature 
region where lattice vibrations are the dominant 
scattering mechanism. In the somewhat less pure 
sample, A-2, however, the mobility appears to depend 
slightly less strongly upon temperature, while the 
electron mobility in sample B-2n appears to depend 
somewhat more strongly upon temperature. 

The ratio of electron to hole mobility may be com- 
puted in sample B-2p by the method discussed by 
Shockley.⁷ This leads to a value of 6.5 as compared to
the ratio of 2.3 for the mobility of majority carriers in
sample B-2n to that in sample B-2p at the temperature
of the Hall reversal. This is considered reasonable
agreement in view of the lower purity of the sample
B-2n and the dependence of mobility upon purity. 
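
The way mobility values follow from the measured Hall coefficient and conductivity can be sketched as below. This is a minimal single-carrier illustration, assuming a Hall factor of unity and only one type of carrier, which is not valid near the Hall reversal where both carriers contribute. The pairing of a conductivity of 12 ohm⁻¹ cm⁻¹ with a carrier density of 10¹⁷ cm⁻³ in the usage note is hypothetical, and the function names are illustrative.

```python
E_CHARGE = 1.602176634e-19  # electronic charge, C

def carrier_density(hall_coeff_cm3_per_c):
    """Carrier density (cm^-3) from the Hall coefficient, n = 1 / (e |R_H|)."""
    return 1.0 / (E_CHARGE * abs(hall_coeff_cm3_per_c))

def hall_mobility(hall_coeff_cm3_per_c, conductivity_per_ohm_cm):
    """Hall mobility (cm^2 / volt sec), mu = |R_H| * sigma."""
    return abs(hall_coeff_cm3_per_c) * conductivity_per_ohm_cm

# Hypothetical inputs: R_H chosen to correspond to 1e17 carriers per cm^3
r_hall = 1.0 / (E_CHARGE * 1.0e17)     # about 62 cm^3 / coulomb
mu = hall_mobility(r_hall, 12.0)       # about 750 cm^2 / volt sec
```

A ratio of two such mobilities, one from an n-type and one from a p-type sample at the same temperature, is the kind of quantity compared above with the value obtained from the Hall reversal.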

The chief problem at present in this work appears to 
be the preparation of GaSb of purity comparable to that 
of germanium. A limiting purity of about 10¹⁷ carriers
per cm³ has been obtained by several workers.¹,⁴,⁸ This
is believed to be the result of a slight deviation of the 
composition from stoichiometry rather than a chemical 
impurity effect, since the results of the various labora- 
tories agree so closely. Further, it has been found in the 
course of the present investigation by analysis of the 
distillate from previously zone-refined GaSb heated 
under vacuum that antimony is evaporated prefer- 
entially. Several samples collected at various distillation 


7 W. Shockley, Electrons and Holes in Semiconductors (D. Van
Nostrand Publishing Company, New York, 1950).
8 D. A. Jenny (private communication).


temperatures above the melting point showed antimony 
contents of from 75 to 95 atomic percent in the distillate. 
This is in agreement with the observed p-type con- 
ductivity in the present materials. 


B. p-n Junction Rectification 


Several p-n junctions were grown from the melt by
doping an initially p-type melt with tellurium when the 
crystal was partly grown. Figure 4 shows the dc 
characteristic of one of these junctions as measured at 
room temperature and at −196°C. Although photosensitivity
of the junction was noted at high levels of
illumination, no difference could be observed between 
measurements in darkness and in room light. Conse- 
quently, most measurements were carried out in room 
light. 

Rectification ratios at one volt of about 200 and 500 
were observed at room temperature and −196°C,
respectively. However, considerable “softening” of the
reverse characteristic is found with no well-defined 
saturation region. This and the rather low photo- 
sensitivity are believed to be manifestations of the 
existence of a small minority carrier lifetime in this 
material. 

The width of the forbidden energy gap, $E_g$, may be
determined from the forward p-n junction characteristic
in the region where the current becomes linear with
voltage. Here the applied voltage is greater than $E_g$,
so the current is limited only by the ohmic resistance of
the specimen. Extrapolating this region back to the
zero-current axis, one obtains the voltage required
to overcome the barrier at the junction, i.e., $E_g$.
Performing this extrapolation on the low-temperature
data shown in Fig. 4, one obtains a value for $E_g$ of
0.78 ev. This compares very well with the value of 0.77
ev calculated from Leifer and Dunlap’s⁴ values of $E_g$
at room temperature and $a$, the temperature coefficient
of $E_g$.
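
The extrapolation described above amounts to fitting a straight line to the ohmic part of the forward characteristic and finding where it crosses zero current. The following sketch shows the arithmetic on synthetic points; the voltages, the 5-ohm series resistance, and the function names are hypothetical and are not read from Fig. 4.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def zero_current_voltage(volts, amps):
    """Voltage at which the fitted linear region extrapolates to zero current."""
    slope, intercept = linear_fit(volts, amps)
    return -intercept / slope

# Synthetic ohmic region: barrier of 0.78 volt, series resistance 5 ohms (both hypothetical)
volts = [1.0, 1.1, 1.2, 1.3, 1.4]
amps = [(v - 0.78) / 5.0 for v in volts]
barrier = zero_current_voltage(volts, amps)
```

Only points from the region where the current is already linear in voltage should be passed to the fit; including the curved low-voltage part would bias the intercept.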

Field Emission from Rhenium: Emission Pattern Corresponding to
Hexagonal Crystal Structure*†

Field emission patterns from a clean metal having the hexagonal crystal structure (rhenium) are photographed.
The fabrication of rhenium needle-shaped cathodes having tip radii of the order of 5×10⁻⁵ cm
is described, and electron microscope shadowgraphs of such emitters are shown. The use of a high-tempera- 
ture flash in vacuum to smooth and clean the emitter is illustrated. Emission pattern detail is correlated 
with crystallographic structure; in particular, the crystal faces with low Miller-Bravais indices have reduced 
current density, resulting, it is assumed, from values of the work function which are higher than the average 
value. Rhenium is shown to be more resistant to work function change by oxygen adsorption than is tungsten 
by a direct comparison under identical experimental conditions. 





INTRODUCTION 


CERTAIN physical and chemical properties of the
metal rhenium have been known for some time,!-* 
and further intensive studies of the element have been 
undertaken recently. As a result, several important 
thermal and electrical constants of rhenium have been 
determined and others are now known with greater 
accuracy. Meanwhile, metallurgical investigations have 
led to the fabrication of smaller-diameter rhenium 
wire for the first time.* As several of the physical 
properties of this metal appeared superior, for field 
emission purposes, to properties of some of the more 
commonly used emitter materials, a study of the 
performance of rhenium field emitters was initiated. 
This paper reports the initial results of that study, 
including the first field emission patterns from rhenium 
and the observation that such cathodes are relatively 
immune to poisoning by oxygen adsorption, a factor 
contributing to the electrical instability of tungsten 
cathodes in typical experimental environments.® 
One reason for selecting rhenium is its relatively 
high melting point. It is often desired to clean the 
surface of a field emitter thoroughly without causing 
excessive changes in its geometry. Such cleaning may 
help stabilize the electron emission®; moreover, in 
studying the effects of surface adsorbates upon the 
emission characteristics, emitters are usually cleaned 
before the adsorbate is introduced onto their surfaces.*-® 
Such cleaning may be accomplished at temperatures 
suitable for evaporation of adsorbates from the cathode 
surface; however, a high melting point cathode material
is then needed to minimize undesired geometric changes
in the emitter. As the melting point of rhenium’? is 
about 3170°C, it appeared probable that the metal 
could be cleaned and its field emission pattern obtained, 
provided that a method for fabricating such
cathodes could be developed.

Another reason for studying rhenium is that its 
crystal structure (hexagonal close-packed) is different
from those of other previously used emitter materials. 
According to Mueller,’ methods for fabricating and 
successfully cleaning emitters have been developed for 
only nine different metals representing only the face- 
centered cubic and the body-centered cubic structures. 
Field emission patterns reported herein for rhenium 
are different from patterns published previously, as was 
expected since such patterns depend on crystallographic 
structure. Availability of such patterns will permit 
further study of several of the properties of rhenium. 
For example, estimates of the variation of work function 
over the rhenium single crystal surface are made herein 
and quantitative studies may now follow. 

For the present experiments, a few inches of 50-mil 
rhenium wire were obtained from Battelle Memorial 
Institute* and the construction of rhenium field emitters 
of size and smoothness comparable to that of good 
tungsten emitters was undertaken. A description of 
emitter fabrication and experimental results from 
rhenium emitters is given. 


EMITTER FABRICATION 


In order to develop the large cathode electric fields 
(10⁷ to 10⁸ v/cm) needed for field emission with
reasonable values of the applied potential, needle- 
shaped cathodes are commonly employed. Because of 
their microscopic size, the tips of such needles are 
often shaped by electrochemical etch. It was found that 
rhenium could be electrolytically etched in an aqueous 
solution of 1.0 normal NaOH using the rhenium as one 
electrode and a piece of nickel, which is inert to this 
solution, as the other.¹⁰ At low ac or dc potentials, up
to about 25 volts, the etched surface of rhenium was 
found to be highly irregular when examined under an 
optical microscope, the irregularities being many times 
the radius of the tip of a usable emitter," i.e., about 
5×10⁻⁵ cm. A much smoother etched surface resulted
if just the end of a piece of wire was immersed in the 
etching solution and a sufficiently high ac potential 
was then applied to produce a visible arc under the 
surface of the liquid.” 

Emitter blanks of rhenium, 3-mm lengths of 15-mil 
wire etched down from 50 mils, were spot-welded onto 
tungsten filaments." The blanks were then etched by 
means of a dc potential of from 5 to 10 volts until they 
appeared sharp to the unaided eye, although at this 
stage they were quite rough when examined under an 
optical microscope at 100X. The final etch and smooth- 
ing were accomplished by applying an ac potential 
in several short pulses, about 0.05 sec each, the value of 
the potential being just sufficient to produce a visible 
arc under the etching solution. The number of pulses 
required ranged from 10 to 30 and was determined by 
microscopic inspection of the emitter. Emitters thus 
produced were then examined in a type EMT-RCA 
electron microscope" and were found to have tip radii 
and smoothness suitable for use as field emitters. An 
electron microscopic shadow picture of a typical 
rhenium emitter fabricated in this manner is shown in 
Fig. 1. In contrast, emitter Re6 [Fig. 2(A)] was etched
as described above, including the ac arcs, except that 
no visible arc occurred during the final pulse. The 
excessive roughness of the tip of emitter Re6 was later 
removed by surface migration when the emitter was 
heated at a temperature of 2500°C for one minute in 
a vacuum. The result is shown in Fig. 2(B). The latter 
form has been established as a suitable configuration for 
a typical rhenium field emitter. 


EXPERIMENTAL METHOD 


Rhenium emitters, constructed as described above 
and examined for size and shape in an electron micro- 
scope, were mounted in Mueller type electron projection 
tubes of the particular design commonly used at this 

Fig. 1. Electron microscope shadowgraph of a typical rhenium
emitter, Re5.

Fig. 2. Electron microscope shadowgraphs of rhenium emitter
Re6: (A) following etch, and (B) showing the removal of surface
irregularities by heating the emitter to 2500°C for one minute in
a vacuum.


laboratory." In such a tube, the electron stream 
diverged from the field emission cathode towards an 
anode consisting of a thin aluminum film evaporated 
onto a willemite phosphor screen,"*:“ the screen being 
deposited“ on the inner surface of one hemisphere of a 
spherical pyrex envelope. As the thin layer of alumi- 
num was transparent to electrons of the energies used 
herein, such electrons passed through the anode and 
struck the phosphor, thus producing a visible emission 
pattern as was first observed by Mueller.’ The tube was 
evacuated using a type QHQ-10-02 D.P.I. mercury 
diffusion pump and liquid air traps. After several cycles 
of outgassing and baking,!® the tube was sealed off 
from the vacuum system and the pressure of chemically 
active gases further reduced by means of a tantalum 
getter.!5 Pressures of chemically active gases of 10™ 
mm Hg are commonly attained.* The production of 
high potentials and measurement of field currents and 
applied voltages are described in reference 15. 


EXPERIMENTAL RESULTS 


Since much of the following analysis depends on the 
detail in field emission patterns, a few general comments
concerning such patterns may be helpful. In the
case of tungsten, for example, the dark and light areas
of the emission pattern have been correlated with the 
known crystallographic structure and its orientation 
relative to the emitter axis, and the appearance of the 
pattern from the clean metal is well known [see Fig. 
3(A)]. The symmetrically arranged dark areas of the 

Fig. 3. Field emission patterns from tungsten showing (A)
clean tungsten, (B) oxygen-tungsten which accumulated during a
two-week exposure of the emitter surface to residual gases in the tube.
Rhenium, under the same experimental conditions (Fig. 7), does
not show a similar effect.


tungsten pattern correspond to those faces of the 
monocrystalline emitter tip having low indices but 
high work function; therefore those faces have corre- 
spondingly reduced electron emission. Similarly, the 
lighter areas of the pattern correspond to crystal faces 
of higher indices but lower work function.”-!® 

Figure 4(A) shows the emission pattern obtained from 
a rhenium emitter in the sealed-off experimental tube 
before the emitter surface was cleaned. A criterion for 
cleanliness of the emitter, well established in the case of 
tungsten, is the relatively uniform distribution of 
current density over most of the cathode surface (i.e., 
lack of irregular bright or dark spots on the emission
pattern)’; therefore, it appeared probable that the
pattern in Fig. 4(A) was not that of clean rhenium. 
The emitter was then given additional heat treatment 
as indicated in the caption of Fig. 4. For a time,
Fig. 4(F) was judged to be the pattern of clean rhenium, 
but further heat treatment of other emitters showed 
that even greater uniformity of the current density 
distribution could be attained. It is presently assumed
that the pattern in Fig. 5 corresponds to clean rhenium, 
a form which persists after prolonged heating of the 
emitter at 2500°C in vacuum. 

A typical field current-voltage relationship obtained
from a clean rhenium emitter during direct current operation
is shown in Fig. 6. That the relationship between
the logarithm of the current $I$ and the reciprocal of
the applied voltage $V$ is linear is consistent with the
empirical relation,

$$I = C \exp(-B/V), \qquad (1)$$

given by Millikan and Lauritsen,¹⁷ and shows that the
observed current is due to field emission. In Eq. (1),
$C$ and $B$ are constants.
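
The linearity test implied by Eq. (1) can be carried out by fitting a straight line to the logarithm of the current against the reciprocal of the voltage: the slope gives $-B$ and the intercept gives $\ln C$. The sketch below does this on synthetic points. The constants used to generate them are hypothetical and do not come from Fig. 6, and the function name is introduced here for illustration.

```python
import math

def fit_field_emission(volts, amps):
    """Fit ln I = ln C - B / V by least squares; returns (C, B)."""
    xs = [1.0 / v for v in volts]
    ys = [math.log(i) for i in amps]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

# Synthetic data generated from Eq. (1) with hypothetical constants
C_TRUE, B_TRUE = 1.0e-2, 1.0e5          # amperes, volts
volts = [8000.0, 8500.0, 9000.0, 9500.0, 10000.0]
amps = [C_TRUE * math.exp(-B_TRUE / v) for v in volts]
c_fit, b_fit = fit_field_emission(volts, amps)
```

With measured currents, appreciable curvature in the plot of ln I against 1/V would suggest that something other than field emission contributes to the current.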

For a number of purposes, a field emission cathode is 
desired whose electrical properties are stable in time in 
spite of the incidence upon its surface of typical im- 
purities found in vacuum tubes. The work function of 

Fig. 4. Field emission patterns observed during the cleaning of
emitter Re5 (see Fig. 1). (A). The emission pattern before the
emitter surface was cleaned. The applied potential was 4000 volts.
(B). Emission pattern after the emitter was flashed for 1 sec at
2200°C, V=6000 volts. (C). Emission pattern after the emitter
was heated for 1 min at 2300°C, V=6400 volts. (D). Emission
pattern after the emitter was further heated for 1 min at 2500°C,
V=8100 volts. (E). Emission pattern after an additional 1 min
of heating at 2500°C, V=8800 volts. (F). Emission pattern after
another 1 min interval of heating at 2500°C (total heating time
at 2500°C was 3 min), V=9200 volts.


tungsten, for example, is altered, and as a result so are 
its emission properties, during the adsorption of 
oxygen onto the cathode surface®:’; moreover, sufficient 
oxygen adsorption for this purpose is avoided with 


Fig. 5. Field emission pattern from a clean rhenium emitter (Re7).

Fig. 6. A typical current-voltage relationship obtained during the
direct current operation of emitter Re7 (see Fig. 5).


great difficulty in practice. Therefore, it was desirable 
to compare, under identical conditions, the effects of 
oxygen adsorption on rhenium with the known effects 
on tungsten. For this purpose, a rhenium emitter, Re5,
the electron micrograph of which is shown in Fig. 1, was
mounted in one projection tube, and a tungsten
emitter was placed in a similar tube, both tubes being
connected together and sealed off as part of the same
evacuated system. The latter tube was used to give
an indication of the pressure of chemically active gases
by observations of the rate of contamination of the
tungsten emitter.¹⁸ By this method, the pressure of
chemically active gases after seal-off was judged to be 
less than 10-” mm of Hg. Also, by means of this 
experimental arrangement, the rates of contamination 
of tungsten and rhenium, as evidenced by their field 
emission patterns, could be compared while the two 
were under the same vacuum conditions. Both emitters 
were heated until clean and then allowed to stand without
further heating (the patterns were observed occasionally)
for a period of two weeks. At the end of that
time, the tungsten pattern showed definite contamination
of that emitter in a manner characteristic of adsorbed
oxygen (Fig. 3),’ while no visible change occurred in
the rhenium pattern (Fig. 7). It is concluded that the
electrical properties of rhenium are more stable than
those of tungsten in the presence of the given low
pressure of oxygen gas.


Fig. 7. (A) Field emission pattern from rhenium emitter Re5
after cleaning by flashing (1 sec) at 2500°C. (B) Pattern from
emitter Re5 after a two-week period in the same vacuum system
as the tungsten emitter shown in Fig. 3.


18 J. A. Becker (private communication).


HEXAGONAL EMISSION PATTERN 


An examination of the patterns obtained from five 
rhenium emitters studied shows that, in each case, the 
axis of six-fold symmetry, resulting from the hcp
structure, was perpendicular (or nearly so) to the
emitter axis. The other crystal directions appear to be
rotated at random, from emitter to emitter, about the 
axis of six-fold symmetry as may be seen by Figs. 5 and 7. 

The Miller-Bravais indexes” of the crystal planes 
corresponding to the dark areas of the pattern were 
determined from the geometry of the hexagonal 
close-pack structure with the aid of a crystal model. 
In the model (Figs. 8 and 9), the base plane has the 


Fic. 8. Photograph of a crystal 
model of the hexagonal close- 
packed structure showing the 0110 
plane which is perpendicular to the 
base of the model. Colored marbles 
make other planes (as indicated) 
more easily recognized. 


indices 0001. As has been indicated, the crystallographic 
direction corresponding to this plane is at right angles 
to the emitter axis. The two sets of planes perpendicular 
to the 0001 plane and having the next smallest indices 
make up the hexagonal prisms {0110} and {1120}. 
It can be seen from the model that, of these two sets of 
planes, the one exhibiting the smoothest surface has the 
indices {0110}. It has been established in the case of 
tungsten that the larger and darker areas of the emission 
pattern correspond to crystal planes of higher work 
function where electron emission is reduced.” The 
planes of highest work function are those of low indices 
and greater surface smoothness, a factor contributing 
to the work function values.!® The same rules permit 
correlation of pattern and crystallographic structure 
in the present case of rhenium. Accordingly, the large 
round dark area near the center of the emission patterns 
of Fig. 7 has the indices $\{1\bar{1}00\}$. If it is assigned the
indices $10\bar{1}0$, the other dark holes are as indicated in
Fig. 10. That this is a choice of indices consistent with
the geometry of the pattern may be verified as follows. 


19 F. C. Phillips, An Introduction to Crystallography (Longmans,
Green and Company, New York, 1946).
20 M. H. Nichols, Phys. Rev. 57, 297 (1940).

Fig. 9. Photograph of crystal model of the hexagonal close-packed
structure showing the $11\bar{2}0$ plane, which is represented by
the black marbles. The $\bar{2}112$ plane is also in black marbles.


It will be observed that the distance on the pattern
between the $11\bar{2}2$ and the $11\bar{2}\bar{2}$ areas is somewhat
greater than the distance between the $10\bar{1}1$ and the
$10\bar{1}\bar{1}$ areas. It can be shown from the geometry of the
hcp structure that the angular separation of the $11\bar{2}2$
and the $11\bar{2}\bar{2}$ directions is 63.0° and that of the $10\bar{1}1$
and the $10\bar{1}\bar{1}$ directions is 56.0°. Similar correlations
between dark areas from all parts of the visible pattern 
and other crystal planes of low indices were made. 
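
The angular separations quoted above can be checked from the hexagonal reciprocal-lattice metric. The sketch below is illustrative only: the axial ratio c/a = 1.615 for rhenium is an assumed value that the text does not quote, the helper name is hypothetical, and each pair of planes is assumed to differ only in the sign of the last index.

```python
import math

def angle_between_plane_normals(hkl1, hkl2, c_over_a):
    """Angle in degrees between the normals of two hexagonal planes (h, k, l).

    Three-index form; the Miller-Bravais index i = -(h + k) is implied.
    """
    def dot(p, q):
        h1, k1, l1 = p
        h2, k2, l2 = q
        # Reciprocal-lattice metric of a hexagonal cell, in units of 1/a^2.
        in_plane = (4.0 / 3.0) * (h1 * h2 + k1 * k2 + 0.5 * (h1 * k2 + h2 * k1))
        return in_plane + l1 * l2 / c_over_a ** 2

    cos_phi = dot(hkl1, hkl2) / math.sqrt(dot(hkl1, hkl1) * dot(hkl2, hkl2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_phi))))

C_OVER_A_RHENIUM = 1.615  # assumed axial ratio, not quoted in the text

# Pairs assumed to differ only in the sign of the last index.
pyramidal_2nd = angle_between_plane_normals((1, 1, 2), (1, 1, -2), C_OVER_A_RHENIUM)
pyramidal_1st = angle_between_plane_normals((1, 0, 1), (1, 0, -1), C_OVER_A_RHENIUM)
```

With the assumed axial ratio, both computed separations fall within about a degree of the quoted 63.0° and 56.0°, and the first pair is the wider one, as observed on the pattern.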

It is concluded that the faces of the rhenium hcp 
structure having low indices yield less field current 
density than other faces at a given value of applied 
voltage. The effect is probably due to higher work 
function'®; however, it may be due in part to locally 
reduced field if such faces are shown to be planes of 
extended areas as is often true in the case of tungsten.!® 
If it is assumed that differences in intensity over the 
emission pattern are caused by differences in work 
function over the surface of the monocrystalline emitter 
tip, the variation in work function may be estimated 
to be in the order of 10 percent; a more quantitative 
study is possible".”!.22 and will be forthcoming. 


21 M. K. Wilkinson, J. Appl. Phys. 24, 1203 (1953).
22 George Barnes, J. Opt. Soc. Am. 43, 1176 (1953).


Fig. 10. Emission pattern from a metal of hexagonal close-packed
crystal structure (rhenium) in which the dark “holes”
are labeled with the Miller-Bravais indices of the corresponding
planes of the monocrystalline emitter tip.


The rhenium emission pattern also indicates that there
are more crystal faces having high work function on
the monocrystalline rhenium emitter tip than is the
case for a tungsten emitter; this is a result which is to
be expected because of the hexagonal crystal structure
of the rhenium.


The twinning of the antiferroelectric crystals PbZrO3 and NaNbO3 has been studied. Essential differences
between this twinning and the domain structure of ferroelectric BaTiO3 have been observed. It is shown
that both orthorhombic PbZrO3 and NaNbO3 are optically negative and the refractive index is smallest along
the a axis, along which axis the main antiparallel ion shifts have been reported. The birefringence of the
crystals has been measured as a function of the temperature up to the transition points, and compared with
the temperature dependence of the spontaneous strain.





I. INTRODUCTION 


Some crystals of the cubic perovskite type at higher
temperatures are known to undergo, on cooling, 
phase transitions accompanied by dielectric anomalies. 
According to their behavior below the transition points, 
these crystals can be divided into two groups: ferro- 
electrics and antiferroelectrics. 

Crystals of the first group are characterized by a 
pronounced anomaly of the dielectric constant at the 
transition temperature, below which they show a re- 
versible spontaneous polarization. The most important 
crystal in this group is the well-studied BaTiO3,¹ which
undergoes a phase transition at 120°C, being cubic 
above and polar-tetragonal below the transition tem- 
perature. Two subsequent phase changes are observed 
upon decreasing temperature: from polar-tetragonal to 
polar-orthorhombic at 5°C, and to polar-rhombohedral 
at —70°C. The crystal is thus spontaneously polarized 
first along a cubic edge, then along a face diagonal, 
and finally along a body diagonal. These polarizations 
are accompanied by strains which generally involve an 
extension in the direction of polarization and contrac- 
tions perpendicular to it. Ferroelectric properties have 
also been found in KNbO3,² PbTiO3,³ and KTaO3.⁴

Crystals of the second group (antiferroelectrics) are 
characterized by phase changes in which the dielectric 
constants behave in a similar way as at a ferroelectric 
transition, but no hysteresis loops and no permanent 
polarization are found below the transition temperature. 
The transition is to a new nonpolar state, characterized 
by the existence of antiparallel dipole orientations 
within the lattice. The first crystal to be discovered in 
this group was PbZrO3.⁵ Similar properties were subsequently
observed in PbHfO3⁶ and NaNbO3.²,⁷

The properties of the antiferroelectric PbZrO3 have
been examined heretofore chiefly with polycrystalline
ceramic samples. The dielectric constant of PbZrO3
shows a pronounced anomaly at 230°C, and yet no
hysteresis loops can be observed below this temperature.
The room temperature structure was first reported
from powder photographs to be tetragonal⁸ with lattice
parameters: a=4.159 Å and c/a=0.988. The powder
pattern contains some extra lines which can only be
explained by assuming a multiple unit cell. A later
x-ray and optical study⁹ of some very minute crystals
of PbZrO3 revealed the true symmetry at room temperature
to be orthorhombic, with the lattice parameters
given in Table I. One of the original cubic axes
becomes an orthorhombic c axis, and the other two
orthorhombic axes lie at 45° to the cubic axes, as in
the case of orthorhombic BaTiO3 (see Fig. 1). The
x-ray study indicated antiparallel shifts of the Pb ions
along the a axis, as shown in Fig. 2.

The orthorhombic b axis of PbZrO3 is exactly equal
to 2a; i.e., there is no measurable shear distortion of the
ideal cubic lattice in the ab plane. This peculiar point
will be discussed later. If comparisons are to be made
between PbZrO3 and BaTiO3, the properties of PbZrO3
below 230°C should be compared with those of BaTiO3
between 0°C and −70°C.

The case of NaNbO3 is similar to that of PbZrO3.
The room temperature structure is also orthorhombic,


TABLE I. Lattice parameters of orthorhombic modification of
perovskite-type crystals. Orthorhombic axes a, b, and c are related
to monoclinic parametersᵃ $a_0=b_0$, $c_0$, and $\beta$ by: $a=2a_0\sin(\beta/2)$,
$b=2a_0\cos(\beta/2)\times m$, $c=c_0\times n$.

Crystal     a (Å)    b (Å)      c (Å)      a0/c0    β−90°
BaTiO3ᵇ     5.68     5.67       3.99       1.006    8'
KNbO3ᶜ      5.72     5.69       3.97       1.016    16'
PbZrO3ᵈ     5.88     5.88×2     4.10×2     1.012    ~0
NaNbO3ᶜ     5.57     5.51       3.88×4     1.009    40'

ᵃ c0 designates monoclinic axis.
ᵇ See reference 12.
ᶜ See reference 13.
ᵈ See reference 9.


8 H. D. Megaw, Proc. Phys. Soc. (London) 58, 133 (1946).

9 Sawaguchi, Maniwa, and Hoshino, Phys. Rev. 83, 1078 (1951).

10 The technique used for growing these crystals is not given in
reference 9, but a private communication of these authors states
that the crystals were grown from a binary melt of PbZrO3
and PbCl2.

Fig. 1. Orthorhombic axes and axes of indicatrix in PbZrO3.


and antiparallel ion shifts are observed within the ab
plane.¹¹ The orthorhombic shear distortion is much
larger than in PbZrO3, however (see Table I). NaNbO3
has a cubic lattice of the perovskite type above 640°C.
Upon cooling, it undergoes three phase changes: at
640°C, at 480°C, and at 360°C. The phases between
these transition points both seem to be tetragonal. We
will be concerned here mainly with the room-temperature
orthorhombic modification.

Table I shows the lattice parameters of the orthorhombic
phases of the ferroelectrics BaTiO3 and KNbO3,
and of the antiferroelectrics PbZrO3 and NaNbO3.⁹,¹²,¹³
For all of the crystals listed, the orthorhombic a axis is
chosen to be along the direction of the main ion shifts.
Parameters are also given in terms of a monoclinic
lattice, for convenience in comparison of the distortions
from the cubic lattice.
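
The relation between the orthorhombic and monoclinic descriptions given in the caption of Table I can be inverted numerically. The following is a minimal sketch, assuming the caption relations a = 2a0 sin(β/2), b = 2a0 cos(β/2) × m, c = c0 × n; the function name is hypothetical and the inputs are the rounded NaNbO3 entries of the table.

```python
import math

def monoclinic_from_orthorhombic(a, b, c, m=1, n=1):
    """Invert a = 2*a0*sin(beta/2), b = 2*a0*cos(beta/2)*m, c = c0*n.

    Returns (a0, c0, beta_in_degrees).
    """
    b_sub = b / m                      # b referred to the single subcell
    c0 = c / n
    a0 = 0.5 * math.hypot(a, b_sub)    # a^2 + b_sub^2 = (2*a0)^2
    beta = 2.0 * math.degrees(math.atan2(a, b_sub))
    return a0, c0, beta

# NaNbO3 as listed in Table I: a = 5.57, b = 5.51, c = 3.88 x 4 (Angstroms).
a0, c0, beta = monoclinic_from_orthorhombic(5.57, 5.51, 3.88 * 4, m=1, n=4)
axial_ratio = a0 / c0
shear_minutes = (beta - 90.0) * 60.0
```

Because the tabulated axes are rounded to two decimals, the recovered shear angle agrees with the listed value only to within a few minutes of arc; the axial ratio is reproduced more closely.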

The present study was undertaken as a further step
toward an understanding of the mechanisms of antiferroelectric
transitions in perovskite-type crystals.
The preparation of single crystals of PbZrO3 is discussed,
and the optical properties of these and of
NaNbO3 single crystals are examined. Attention has
been given to the type of twinning in these crystals.


II. PREPARATION OF LEAD ZIRCONATE CRYSTALS


Severe evaporation of PbO at higher temperatures
renders difficult the preparation of PbZrO3 crystals
from the pure melt. To determine the extent of this
evaporation, the following experiment was repeated at
several temperatures. About 1 g of finely powdered
PbZrO3 was placed in a covered Pt crucible; a globar
oven was preheated to the specified temperature, after
which the crucible was introduced, kept in it for one
hour, and then removed. Cooling took place rapidly in
air at room temperature. Results are presented in
Table II.


TABLE II. Loss of PbO by evaporation from PbZrO3
at high temperatures.

Firing temperature (°C)    Loss of PbO (%)
1000                        0.4
1100                        3.5
1200                       27.9
1300                       53.7





11 P. Vousden, Acta Cryst. 4, 545 (1951).
12 H. F. Kay and P. Vousden, Phil. Mag. 40, 1019 (1949).
13 P. Vousden, Acta Cryst. 4, 373 (1951).

Fig. 2. Antiferroelectric structure of PbZrO3, according to
Sawaguchi et al. (see reference 9). Arrows represent direction of
shifts of Pb ions; solid line shows orthorhombic unit cell.


Because of this loss of PbO, we resorted to the
method of binary melts, in order that the preparation
temperature be as low as possible. Experiments were
conducted to find the proper flux, the proper ratio of
flux to PbZrO3, and the most favorable temperature.
The following fluxes were tried, the molar ratios of
flux to PbZrO3 ranging from 10:1 to 1:1: NaF,
NaCl, KF, KCl, K2CO3, RbCl, B2O3, AlF3, CaCl2, PbO,
PbCl2, and PbF2. The maximum temperature was
always lower than 1250°C. No crystals were obtained.

Small crystals of PbZrO3 have been obtained through
the use of PbF2, and recently also PbCl2, as a flux, only
for compositions close to the PbZrO3 side of the phase
diagram. Best results were obtained with a molar ratio
of PbF2 to PbZrO3 of 1:2 (e.g., 2.4 g PbF2 + 6.9 g
PbZrO3). The mixture was placed in a covered Pt
crucible, introduced into the furnace at 1250°C, kept
at this temperature for about one hour, and then cooled
at the rate of 50°C/hour. The product appeared as a
dense polycrystalline conglomerate, but isolated crystals
were found attached to the walls of the crucible
immediately above the surface of the conglomerate.
Most of these crystals appear roughly cubic, about
0.3-mm edge length; occasionally octahedron shapes
occur. They are transparent, of a light brown color,
appear orthorhombic under the polarizing microscope,
and show a transition to the cubic system at about
230°C.¹⁴,¹⁵ The x-ray powder pattern is identical with
that from a high-purity ceramic specimen of PbZrO3.¹⁶

It appears that the crystals grow as a consequence of 


14 The dielectric behavior of high-purity ceramic specimens of
PbZrO3 suggested the existence of an intermediate phase approximately
between 225° and 233°C, on cooling only (see reference 15).
Our optical investigation shows that this intermediate phase, if
it exists, lies within less than one degree below the transition point;
but our experimental arrangement is not able to confirm the
existence of this phase within our crystals.

15 Shirane, Sawaguchi, and Takagi, Phys. Rev. 84, 476 (1951).

16 Our measurements give the following tetragonal parameters:
a=4.161±0.001 Å, c/a=0.988, in agreement with Megaw’s results
(see reference 8).

Fig. 3. Twinning of PbZrO3 crystals, showing symmetrical
extinction: (a) Twinning on orthorhombic (110) plane. (b) Twinning
on orthorhombic (111) plane. (c) Twinning on orthorhombic
(hk0) plane.


the evaporation of the solvent, PbF2. The surface level
of the melt decreases progressively and leaves crystals
attached to the crucible walls. No PbF2 is detectable
by x-ray patterns of the polycrystalline conglomerate;
these show the PbZrO3 pattern only. The weight loss
is about 22 percent of the starting total weight. Since
the initial percentage of PbF2 by weight was about 26
percent, and if we assume that the loss in weight is
almost entirely due to evaporation of PbF2, we are left
with about 4 percent of PbF2, which is practically
undetectable in the x-ray powder pattern.
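
The mass balance of the flux can be checked from the quoted charge of 2.4 g PbF2 and 6.9 g PbZrO3. The sketch below is illustrative only; the atomic masses are standard values assumed here rather than taken from the text, and the variable names are hypothetical.

```python
# Standard atomic masses in g/mol (assumed values, not given in the text).
ATOMIC_MASS = {"Pb": 207.2, "Zr": 91.22, "O": 16.00, "F": 19.00}

def molar_mass(formula):
    """Molar mass of a composition given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())

m_flux, m_solute = 2.4, 6.9  # grams of PbF2 and PbZrO3 quoted in the text
mol_flux = m_flux / molar_mass({"Pb": 1, "F": 2})
mol_solute = m_solute / molar_mass({"Pb": 1, "Zr": 1, "O": 3})

# Moles of PbZrO3 per mole of PbF2 (the text quotes PbF2:PbZrO3 = 1:2).
molar_ratio = mol_solute / mol_flux

# Initial weight fraction of flux, and what remains after the quoted
# 22 percent weight loss if all of the loss is attributed to PbF2.
flux_weight_percent = 100.0 * m_flux / (m_flux + m_solute)
residual_flux_percent = flux_weight_percent - 22.0
```

Under these assumptions the charge corresponds to a molar ratio close to 1:2, and the residual PbF2 comes out at a few percent of the starting weight.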

Unfortunately, every attempt to control the evapora- 
tion of PbF2, with the aim of growing larger crystals, 
was unsuccessful. Further research in this direction is 
in progress, as is the use of PbCl2 as a flux.

The extraction of the crystals from the crucible at
the end of the growth process appears quite critical.
The removal of PbF2 remaining in the end product by
means of solution in strong acids is impossible because
PbZrO3 itself undergoes a chemical reaction with these
acids (PbCl2 is formed in HCl, PbSO4 in H2SO4, etc.).
A few crystals can be extracted by mechanical scratching.
Platelike crystals are relatively easy to find. These
are too small for dielectric measurements, but they are
excellent for optical examination, twinning studies, and
x-ray single-crystal examination.

The NaNbO3 crystals used for the optical measurements
are the same crystals used for the previous dielectric
and x-ray study.⁷ They were grown using NaF
as a flux.²


III. TWINNING


PbZrO3 crystals of about 0.3-mm edge length are
always twinned, in a more or less complicated manner. 
The twin configuration can often be changed by heating 
the crystals to temperatures higher than the transition 
point, 230°C, and then cooling them rather rapidly. 
Slow cooling through the transition point sometimes 
reproduces the twin distribution existing before the 
heat treatment. The investigation of the influence of 
external stresses or fields on the twin boundaries could 
not be carried out because of the smallness of the 
crystals. 

The twinning rules for nonpolar crystals arise from
geometric conditions. In the case of polar crystals,
electrostatic conditions may also be influential; the
twin boundaries must be free of charge, and interaction
energy between the twins must be considered. In orthorhombic
BaTiO3, for example, the twinning of plates
with symmetrical extinction takes place on an orthorhombic
(110) plane¹,¹⁷ [see Fig. 3(a)]. Any plane (11l)
is possible, as far as the condition of no charge on the
twin wall is considered, and the (110) plane would
probably be preferred as a consequence of interaction
energies; in any case, geometrical considerations permit only
(110). In NaNbO3, which is certainly nonpolar,¹¹ geometrical
conditions alone favor (110) as the twin plane
of the orthorhombic lattice.

Fig. 4. Twinning on orthorhombic (111) plane in PbZrO3:
(a) Parallel extinction. (b) Parallel extinction: wedge laminae.
(c) Mixed extinction: parallel and symmetrical. Double arrows
show direction of orthorhombic c axis.

PbZrO3 represents a particular case, since the length
of the orthorhombic b axis is exactly twice that of the
orthorhombic a axis. This renders (111) possible as a
plane between twins with symmetrical extinction, on
the basis of crystallographic considerations only. This
twinning plane is observed in PbZrO3 crystals, but its
existence is by itself not sufficient to exclude polarity.
It is only the existence of twinning planes of the general
type (hk0) (with h, k ≠ 1), which can be considered as
experimental evidence for the nonpolarity of the ab
plane. The (hk0) planes could not form domain walls,
in the case of a ferroelectric crystal, even if they were
allowable geometrically, because of the condition requiring
no charge on the wall. This type of twinning
plane has been observed in PbZrO3 sections with symmetrical
extinction, proving the nonpolarity of the ab
plane.

Orthorhombic PbZrO3 shows the following twinning,
predominantly:

(1a) Plates with extinction parallel to the edges
twin as represented schematically in Fig. 4(a); here
double arrows indicate the direction of the orthorhombic
c axis. The twin wall is parallel to a cubic (110) or
orthorhombic (111) plane. Often, wedge twins as in Fig. 4(b)
are visible. This is the type of twinning observed in
orthorhombic BaTiO3.¹,¹⁷ Figure 5 is a microphotograph
of PbZrO3 showing this twinning.

(1b) Plates with mixed extinction positions (parallel
and symmetrical) are twinned as represented in Fig.
4(c). This is the same as described in paragraph (1a),
but as seen from a direction perpendicular to that of
Fig. 4(a). Figure 6 shows this photographically in
PbZrO3.


Fig. 5. Twinning of PbZrO3 crystals, showing parallel extinction.

Fig. 6. Twinning of PbZrO3 crystals, showing parallel
and symmetrical extinction.

17 P. W. Forsbergh, Jr., Phys. Rev. 76, 1187 (1949).

(2) Plates with symmetrical extinction (at 45° to
the cubic edges) are twinned as represented in Fig.
3(a), (b), (c). Figure 3(a) illustrates twinning in
PbZrO3, similar to that in BaTiO3 crystals as already
discussed. In Fig. 3(b), the twin boundary is an
orthorhombic (111) plane, which is only possible in
PbZrO3 because of the peculiar relationship between
the lengths of the orthorhombic axes (b=2a). This
twin configuration is assumed on the basis of the following
experimental evidence. The twin components show
extinction at 45° to the edges; the slow rays of the two
twins are perpendicular to each other, as can be seen
by inserting a unit retardation plate above the crystal
lying between crossed nicols; the two twins are separated
by a region which, in the position of maximum
light intensity and in white light, shows decreasing
interference color fringes towards the black central line
of the region itself.

Finally, Fig. 3(c) shows twinning on an (hk0) plane.
This type of twinning was observed less often than
types (a) and (b), and is again only possible in PbZrO3
because of the peculiar axial relationship and the nonpolarity
in the ab plane. Photographs of PbZrO3
showing this twinning are given in Fig. 7.


IV. OPTICAL PROPERTIES 


Orthorhombic untwinned crystals resulting from a
small distortion of a cubic lattice can be divided into
two types, according to their extinction properties (see
Fig. 1): (1) sections showing extinction parallel to the
cube edges (parallel); (2) sections showing extinction
at 45° to the cube edges (symmetrical).


Fig. 7. Twinning of PbZrO3 crystals with symmetrical extinction:
(a) Interference fringes due to (111) walls are visible.
(b) Other twin configuration of (a) after heating above the transition
point.

Since these untwinned crystals are usually too small
for study with conoscopic light, their properties can
be analyzed as follows. Let $n_a$, $n_b$, $n_c$ be the refractive
indices of light waves vibrating along the crystallographic
directions a, b, c. We wish to determine which of
these indices is the largest and which the smallest, i.e.,
which is $\gamma$ and which is $\alpha$, where $\alpha<\beta<\gamma$; this will
establish the position of the axes of the indicatrix with
respect to the crystallographic directions. The birefringence
of the sections with symmetrical extinction
is $\Delta n_s = n_b - n_a$; that of the sections with parallel extinction
is

$$\Delta n_p = n_c - \delta = n_c - \frac{n_a n_b \sqrt{2}}{(n_a^2+n_b^2)^{1/2}}.$$

Setting $n_b=(1+x)n_a$, where $|x|\ll 1$, and neglecting
terms of second order,

$$\Delta n_p = n_c - \tfrac{1}{2}(n_a+n_b).$$
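
The first-order step can be written out explicitly. The following short expansion is a sketch under the stated assumption that the index for vibration along b is $n_b=(1+x)n_a$ with $|x|\ll 1$, where $n_a$ is the index for vibration along a; terms of order $x^2$ are dropped.

```latex
\begin{align*}
\frac{n_a n_b\sqrt{2}}{(n_a^2+n_b^2)^{1/2}}
  &= \frac{n_a(1+x)\sqrt{2}}{\left[1+(1+x)^2\right]^{1/2}}
   = \frac{n_a(1+x)}{\left(1+x+\tfrac{1}{2}x^2\right)^{1/2}} \\
  &\approx n_a(1+x)\left(1-\tfrac{1}{2}x\right)
   \approx n_a\left(1+\tfrac{1}{2}x\right)
   = \tfrac{1}{2}(n_a+n_b).
\end{align*}
```

To this order, the effective index seen in a section with parallel extinction is simply the mean of the two in-plane indices, which is why the parallel-extinction birefringence reduces to $n_c$ minus that mean.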


The sign of $\Delta n_p$ can be determined experimentally in
the following manner: for the sections showing mixed
extinction [type 1(b) of the preceding section, Fig. 4(c)
and the photograph in Fig. 6], the orthorhombic c axis
must lie within the plane of the section and be perpendicular
to the trace of the twin wall on the same
plane. By using a quartz wedge, it can be proved that
this direction is that of the slow ray, so that it follows
that

$$n_c > \tfrac{1}{2}(n_a+n_b)$$

for both PbZrO3 and NaNbO3.

The sign of $\Delta n_s$ can be determined in the following
way: consider a crystal showing two twins with symmetrical
extinction, as in Fig. 3(a). In NaNbO3, the
angle distortion within the ab plane is 40'. The direction
of the a (or b) axis can be identified by measuring
the angle between the extinction positions of the two
adjacent twins, which differs from 90° by 40'. Then, by
using a quartz wedge, it can be determined whether the
identified direction is the fast or the slow ray. It appeared
that the b direction is that of the slow ray, and thus it
follows that

$$n_b > n_a.$$

In PbZrO3, this method cannot be applied because
2a=b; thus there is no measurable angle distortion
within the ab plane. We therefore picked out an untwinned
crystal showing symmetrical extinction, and
took an x-ray diffraction picture for rotation around the
direction of the slow ray as identified with the quartz
wedge. The measurement of the spacing showed that
the rotation axis was the b axis, so that it follows again
that

$$n_b > n_a.$$

The absolute magnitude of the birefringence $\Delta n_s$ was
examined by two methods. In Na light, interference
fringes were produced in a wedge-shaped twin as
shown in Fig. 4(c), the distance between the fringes was
measured, the thickness of the crystal equivalent to a
given path difference was calculated, and the birefringence
was computed from the formula

$$R_{m\mu} = 10^3 \times (n_b - n_a) \times d_\mu,$$

where $R_{m\mu}$ is the relative retardation in mμ, and $d_\mu$ is
the thickness of the section in μ. The second method
was to measure the retardation R of a plane section by
means of a calibrated quartz wedge (Na light), and
the thickness d by means of the fine focusing adjustment
on the microscope. The computation of $(n_b-n_a)$
was then accomplished with the above formula.
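
The retardation formula lends itself to a short numerical illustration. The sketch below assumes a sodium wavelength of 589 mμ, which the text does not quote; the function names and the example plate are hypothetical, and the two birefringence values 0.080 and 0.005 are taken from Table III.

```python
NA_WAVELENGTH_MMU = 589.0  # sodium D light in millimicrons; assumed, not quoted in the text

def birefringence_from_retardation(retardation_mmu, thickness_mu):
    """Solve R = 1e3 * delta_n * d for delta_n (R in millimicrons, d in microns)."""
    return retardation_mmu / (1.0e3 * thickness_mu)

def thickness_per_fringe_mu(delta_n, wavelength_mmu=NA_WAVELENGTH_MMU):
    """Thickness change in microns that adds one wavelength of path difference."""
    return wavelength_mmu / (1.0e3 * delta_n)

# Hypothetical plate: 20 microns thick with 1600 millimicrons of retardation.
example_dn = birefringence_from_retardation(1600.0, 20.0)

# Fringe spacing in thickness for the two birefringences listed in Table III.
step_large = thickness_per_fringe_mu(0.080)
step_small = thickness_per_fringe_mu(0.005)
```

For a birefringence of 0.005, one full fringe requires a thickness change of more than 100 μ, so a thin plate of such a crystal would not be expected to show fringes at all, whereas a birefringence of 0.080 gives a fringe every few microns.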

Only this second method can be used for the measurement
of $\Delta n_p$ in PbZrO3, because in this case the birefringence
is too small to produce interference fringes in
the thin plates; but both methods served for NaNbO3.

The results of the measurements at room temperature
are given in the first two columns of Table III,
which also contains the corresponding values of orthorhombic
BaTiO3. In BaTiO3, it was shown¹⁷ by optical
observations under application of an electric field that
$n_b>n_a$. We could prove also, by means of the quartz
wedge, that $n_c>\tfrac{1}{2}(n_a+n_b)$.¹⁸

We are now in the position to compute $n_c-n_a \approx \Delta n_p+\tfrac{1}{2}\Delta n_s$
for the three crystals considered. The results
are given in the third column of Table III. The relationship
between $n_a$, $n_b$, $n_c$ and $\alpha$, $\beta$, $\gamma$ also appears
clearly and is given in Table IV. It is seen that
all three crystals are optically negative.
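
The step from the measured birefringences to the ordering of the indices can be sketched as follows. The values are the room-temperature entries of Table III; the assignment of the first two rows to PbZrO3 and NaNbO3 is the reading adopted here, the helper names are hypothetical, and the optic sign is judged by a small-birefringence rule of thumb (negative when β lies nearer to γ than to α), which is an assumption of this sketch rather than the authors' stated procedure.

```python
# Room-temperature birefringences as read from Table III: (delta_n_p, delta_n_s).
TABLE_III = {
    "PbZrO3": (0.005, 0.039),
    "NaNbO3": (0.084, 0.080),
    "BaTiO3": (0.046, 0.075),
}

def index_offsets(dn_p, dn_s):
    """Indices relative to n_a, from dn_p = n_c - (n_a + n_b)/2 and dn_s = n_b - n_a."""
    return {"a": 0.0, "b": dn_s, "c": dn_p + 0.5 * dn_s}

def indicatrix(dn_p, dn_s):
    """Return the axes carrying (alpha, beta, gamma) and the optic sign."""
    offsets = index_offsets(dn_p, dn_s)
    alpha, beta, gamma = sorted(offsets, key=offsets.get)
    # Small-birefringence rule: negative when beta lies nearer gamma than alpha.
    nearer_gamma = offsets[gamma] - offsets[beta] < offsets[beta] - offsets[alpha]
    return (alpha, beta, gamma), ("negative" if nearer_gamma else "positive")

results = {name: indicatrix(*values) for name, values in TABLE_III.items()}
third_column = {name: index_offsets(*values)["c"] for name, values in TABLE_III.items()}
```

With these inputs, the a axis carries the smallest index in every case and all three crystals come out optically negative; the third-column values of Table III are recovered to within rounding.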

These conclusions were experimentally verified by
the study of the largest possible untwinned crystals of
PbZrO3 with conoscopic light. The sections with symmetrical
extinction reveal an “optic normal” interference
figure, whereas the sections with parallel extinction
appear to be almost normal to an optic axis.
The optic axial angle does not differ much from 90°.


TABLE III. Some optical values for orthorhombic
perovskite-type crystals. Here $\Delta n_p = n_c - n_a n_b\sqrt{2}/(n_a^2+n_b^2)^{1/2}$,
$\Delta n_s = n_b - n_a$, and $\Delta n_p + \tfrac{1}{2}\Delta n_s \approx n_c - n_a$.

Crystal     $\Delta n_p$    $\Delta n_s$    $\Delta n_p+\tfrac{1}{2}\Delta n_s$
PbZrO3      0.005     0.039     0.024
NaNbO3      0.084     0.080     0.124
BaTiO3ᵃ     0.046     0.075     0.083

ᵃ See reference 17.


TABLE IV. Orientation of indicatrices for orthorhombic
perovskite-type crystals.

Crystal     $n_a$    $n_b$    $n_c$
PbZrO3      α      γ      β
NaNbO3      α      β      γ
BaTiO3      α      β      γ


18 The optical study of Kay and Vousden (see reference 12)
under application of an electric field seems to imply $n_a>n_b>n_c$,
which is not in accordance with Forsbergh’s and our own results.

Some of the optical properties of NaNbO3 were
studied by Wood.¹⁹ The birefringence of the sections
with symmetrical extinction was reported as 0.13, which
is appreciably higher than our results. According to
Wood, the orthorhombic c axis is the obtuse bisectrix.
This last is in accord with our results, giving ac as the
optic plane in NaNbO3.

It is interesting to note that, for all of the crystals
considered, $n_a$ is the smallest refractive index. The a
axis is the direction in which the main ion shifts were
observed, parallel in BaTiO3, and antiparallel in PbZrO3
and NaNbO3. Furthermore, the crystals are optically
negative, which characterizes once more the direction
of the spontaneous polarization of the original cubic
cell with respect to the directions perpendicular to it.
It may be recalled that in tetragonal BaTiO3, $n_c<n_a$.

The value of the refractive index of PbZrO3 has
been roughly estimated by using Chaulnes’ method for
determining the ratio between true and optical thickness;
this gives the result $\gamma \approx 2.2$.

The temperature dependence of $(\gamma-\alpha)$ and of
$[\beta-\alpha\gamma\sqrt{2}/(\alpha^2+\gamma^2)^{1/2}]=(\beta-\delta)$ for PbZrO3 has been
measured with the quartz wedge, with the results shown
in Fig. 8.

Figure 9 shows the temperature dependence of the
birefringences of NaNbO3 crystals, showing parallel
and symmetrical extinctions. Above 360°C, all NaNbO3
crystals show parallel extinction, since the symmetry
is tetragonal there.


V. DISCUSSION 


The temperature dependence of the birefringence of
PbZrO3 and NaNbO3 is of interest because it is related
to the spontaneous strains which take place below the
transition points, and to the spontaneous polarizations
of the substructures. The treatment of the change in
refractive index due to strain and polarization was
first given by Pockels.²⁰ In the case of tetragonal BaTiO3,
this treatment was shown to give satisfactory results¹⁷;
the birefringence of the tetragonal phase can be explained
by the linear elasto-optic effect and by the
spontaneous Kerr effect proportional to the square of
the spontaneous polarization $P_s$. Since it was shown
that the spontaneous strain is proportional to $P_s^2$,²¹ it
was possible to establish a proportionality between the
birefringence $\Delta n$ and either the strain²² or $P_s^2$ alone.


Fig. 8. Temperature dependence of birefringence
of PbZrO3 crystals.

Fig. 9. Temperature dependence of birefringence
of NaNbO3 crystals.

19 E. A. Wood, Acta Cryst. 4, 353 (1951).

In principle, Pockels’ treatment of the elasto-optic
effect can be applied to a transition like that of PbZrO3
from cubic to orthorhombic. We assume that the following
strains are introduced in the cubic PbZrO3:

$$x_x = y_y,\quad z_z,\quad x_y.$$

We thus obtain for the birefringence of sections with
symmetrical extinction

$$\gamma - \alpha = n^3 p_{44}\, x_y,$$

and for the birefringence of sections with parallel extinction

$$\beta - \delta = \tfrac{1}{2} n^3 (p_{11} - p_{12})(x_x - z_z),$$

where $n$ is the refractive index of the cubic phase, and
the $p_{ik}$ are the elasto-optical constants. The above results
again neglect second order effects.
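
The two linearized expressions can be compared with a direct evaluation of the strained index ellipsoid. The following is a minimal sketch, not the authors' calculation: the elasto-optic constants and strains are hypothetical placeholders (the text notes that the actual constants are not known), only n = 2.2 is taken from the rough estimate quoted earlier, and the function name is an assumption.

```python
def principal_indices(n, p11, p12, p44, xx, zz, xy):
    """Principal indices of a strained cubic crystal with x_x = y_y.

    The impermeability tensor B = 1/n^2 is perturbed linearly by the strain;
    the two in-plane principal axes lie at 45 degrees to the cube edges.
    """
    b0 = 1.0 / n ** 2
    b_in_plane = b0 + (p11 + p12) * xx + p12 * zz   # B_11 = B_22
    b_shear = p44 * xy                              # B_12
    b_c = b0 + 2.0 * p12 * xx + p11 * zz            # B_33
    n_plus = (b_in_plane + b_shear) ** -0.5
    n_minus = (b_in_plane - b_shear) ** -0.5
    n_c = b_c ** -0.5
    return n_minus, n_plus, n_c

# Hypothetical inputs: only n = 2.2 comes from the text's rough estimate.
n, p11, p12, p44 = 2.2, 0.10, 0.05, 0.02
xx, zz, xy = 0.004, -0.008, 0.001

n_minus, n_plus, n_c = principal_indices(n, p11, p12, p44, xx, zz, xy)
exact_symmetrical = n_minus - n_plus
exact_parallel = n_c - 0.5 * (n_minus + n_plus)

# First-order expressions for comparison.
linear_symmetrical = n ** 3 * p44 * xy
linear_parallel = 0.5 * n ** 3 * (p11 - p12) * (xx - zz)
```

In this sketch the in-plane birefringence is proportional to the shear alone and vanishes identically when the shear is set to zero, whatever the axial strains; this is the point on which the comparison with experiment turns.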

The formula for $(\beta-\delta)$ appears to be reasonable,
since it can be seen (Fig. 8) that $(\beta-\delta)$ behaves quite
similarly to $(x_x-z_z)$, i.e., in the first approximation, to
$(a_0/c_0-1)$.²³ However, the behavior of $(\gamma-\alpha)$, which
was found to be about eight times larger than $(\beta-\delta)$
at room temperature, can hardly be explained by the
above formula (assuming similar orders of magnitude
and temperature behavior of all $p_{ik}$'s) because the shear
$x_y$ of orthorhombic PbZrO3 is extremely small, if it is
present at all.


20 F. Pockels, Lehrbuch der Kristalloptik (B. G. Teubner, Leipzig,
1906).

21 W. J. Merz, Phys. Rev. 76, 1221 (1949).

22 A. F. Devonshire, Phil. Mag. 40, 1040 (1949).

23 E. Sawaguchi, J. Phys. Soc. Japan 7, 110 (1952).

In the case of NaNbO3 the situation is similar: the
value of the axial distortion $(a_0/c_0-1)$, as measured by
x-rays,⁷ diminishes about four times upon heating from
room temperature to the transition point at 360°C.
The birefringence of sections with parallel extinction
shows the same behavior, and could therefore be explained
in principle by the strains $x_x$ and $z_z$ only. The
birefringence of sections with symmetrical extinction
shows the same behavior, whereas the angle distortion
as measured by x-rays⁷ becomes only two times smaller,
at the transition point, than at room temperature,
proving again that the elasto-optic effect alone cannot
explain the experimental results.

The above treatment is evidently incomplete because
the spontaneous strains in the crystal are essentially
related to the antiparallel ion shifts, which in turn
create antiparallel polarizations. Pockels’ treatment of
the elasto- and electro-optic effects is purely phenomenological
and deals only with macroscopic stress,
strain, field, and polarization. The treatment is valid
for BaTiO3, at least to the first approximation, because
the spontaneous strain and polarization of the unit
cell are equal to the macroscopic strain and polarization.
In PbZrO3 and NaNbO3, the situation is more
complicated since the unit cell of the orthorhombic
lattice contains 8 or 16 unit cells of the original cubic
lattice, and the distortions of the latter are in general
different from the macroscopic deformation. The crystal
as a whole has no net polarization; a correct treatment
should imply the separate consideration of the sublattices,
which are polarized in antiparallel directions,
and their superposition. However, no satisfactory comparison
with the experimental data can be made unless
we know the values of the elasto- and electro-optic
constants involved.

The crystal structure of PbZrO3 as given by Sawaguchi
et al.⁹ presents the quite reasonable model of the
antiparallel shifts of Pb ions along the a direction, and
could also explain the large optical anisotropy in the
ab plane. This structure, however, does not suggest
why the lattice parameters are exactly 2a=b. It might
be that some other ions (such as oxygens) shift simultaneously
in the b direction, so as to give a large optical
and dielectric anisotropy within the ab plane while
keeping b=2a. In fact, both the space groups Pba2 and
Pbam proposed for PbZrO3 allow any shift of ions within
the ab plane except for one each of O, or O,. A more
detailed structural study is necessary, if light is to be
shed on this matter.

The authors are greatly indebted to John McLaughlin 
for his assistance during the crystal preparation and the 
measurements. 





PHYSICAL REVIEW        VOLUME 97, NUMBER 6        MARCH 15, 1955


Generation of Electron Traps by Plastic Flow in Alkali Halides* 


Masayasu Ueta† and Werner Känzig
Department of Physics, University of Illinois, Urbana, Illinois 
(Received December 2, 1954) 


Vacancy clusters, generated by moving dislocations, act as effective electron traps and change the photo- 
graphic properties of the alkali halides considerably. This has been investigated by using additively colored 
KCl and electrolytically colored NaCl. Electrons were released from the F-centers by irradiation of F-light. 
In a plastically deformed crystal these electrons are immediately trapped by vacancy clusters and have not 
much chance to return to a negative ion vacancy. Therefore, the F-band bleaches much faster than in the 
case of undeformed crystals. Measurements of the bleaching rates permit an estimation of the number of 
traps generated by plastic flow. About 10¹⁷ traps per cm³ are formed by a plastic strain of 10 percent. The
nature of the traps depends upon the time which has passed after cold work. In the case of KCl, irradiation
of F-light during plastic deformation gives rise to a pronounced enhancement of the M-band. Filling of the
traps 48 hours after cold working results in the formation of a very broad band, centered near 800 mμ. The
generation of F’-centers and the thermal conversion of F’-centers into F-centers are also strongly affected
by the presence of these traps.





A. INTRODUCTION 


It is well known that plastic deformation has considerable
influence on the electrical and optical
properties of the alkali halides. Rather spectacular
effects are, e.g., the enhanced darkenability under x-irradiation
and the temporary rise of the electrical conductivity.
Seitz¹ has shown in detailed discussions that
the generation of vacancies by moving dislocations is
the key to the understanding of these effects.

In a preliminary report the present authors have 
shown that the generation of vacancy aggregates can 
be demonstrated, by using the fact that these act as 
effective electron traps.² The present paper represents
an attempt to estimate the number of traps generated 
by plastic flow and to investigate their nature. 


B. A METHOD TO ESTIMATE THE NUMBER OF TRAPS
GENERATED BY PLASTIC FLOW 

The basic idea underlying our method to estimate the 
number of traps is the following: In crystals containing 
essentially only F-centers (additively or electrolytically 
colored crystals), electrons are released by irradiation 
of F-light. These electrons have a chance to be captured 
by various kinds of traps. If, e.g., the trap is a negative 
ion vacancy, another F-center will be formed. All other 
trapping processes decrease the number of F-centers. 
Trapping in simple clusters of vacancies results in the 
formation of M-, R-, N-, and more complex (unidentified)
centers. The probability that an electron is captured by
a given kind i of trap is proportional to the concentration
nᵢ of these traps times the capture cross
section σᵢ.³ For a crude estimation we may assume that

* Partially supported by the U. S. Office of Naval Research.

† On leave from the Department of Physics, University of
Kyoto, Corning Glass Fellow.

¹ F. Seitz, Phys. Rev. 80, 239 (1950); Advances in Physics 1,
43 (1952).

² M. Ueta and W. Känzig, Phys. Rev. 94, 1390 (1954).

³ For the sake of simplicity, we assume here that no metastable
traps are present. This assumption holds for KCl at room temperature,
but not for NaCl, in which case F’-centers are formed
by irradiation of F-light. These are not stable at room temperature
and decay with a half-life of about 10 minutes.


the trapping cross sections σᵢ do not differ vastly for
similar simple clusters of vacancies. Therefore, the
probability that an electron becomes trapped is roughly
proportional to the total number of traps per cm³. The
bleaching rate of the F-band (for a given initial con- 
centration of F-centers and a given intensity of irradi- 
ation) is therefore proportional to the concentration of 
unfilled traps other than negative ion vacancies. Once 
the traps are filled, bleaching becomes more difficult, 
for direct trapping is no longer possible. Bleaching 
curves of additively colored KCl show clearly a region 
of fast bleaching and a region of slow bleaching (curve I 
in Fig. 1). The decrease Δα of the F-absorption in the
region of fast bleaching corresponds presumably to the 
number of traps initially present in the crystal. 
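
The proportionality argued in this paragraph can be written out as a short worked relation. This is only a sketch: the symbols $P_i$, $\bar\sigma$, and $N_t$ are introduced here for illustration and do not appear in the original, $n_i$ and $\sigma_i$ stand for the trap concentration and capture cross section of species $i$, and the equal-cross-section step is the crude assumption stated in the text. The sums exclude negative ion vacancies, since capture there merely re-forms an F-center.

```latex
\begin{align*}
P_i &\propto n_i\,\sigma_i
  && \text{(capture probability for trap species } i\text{)} \\
\sum_{i \neq \mathrm{vac}} n_i\,\sigma_i
  &\approx \bar\sigma \sum_{i \neq \mathrm{vac}} n_i = \bar\sigma\,N_t
  && \text{(assuming } \sigma_i \approx \bar\sigma\text{)} \\
-\left.\frac{d\alpha}{dt}\right|_{t=0} &\propto N_t
  && \text{(initial bleaching rate of the F-band)}
\end{align*}
```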

If additional traps are formed by cold work, the 
initial bleaching rate is enhanced. It may happen that 
the number of traps exceeds the initial number of 
F-centers. In this case, almost complete bleaching 
occurs in a short time (curve II in Fig. 1). The absolute 
concentration of traps cannot be evaluated from curve 
II alone. However, curve I permits evaluation of the 
proportionality factor between the absolute number of 
traps per cm³ and the initial slope of the bleaching
curve. This proportionality factor may be assumed to
be valid for given experimental conditions, such as
initial concentration of F-centers, intensity of bleaching
light, and thickness of crystal. Hence, once this factor
is established, the absolute concentration of traps can
be evaluated from the initial bleaching rates.

FIG. 1. Optical bleaching of the F-band in additively colored
KCl crystals, measured at room temperature. Curve I: undeformed
crystal. Curve II: deformed crystal with 19% plastic strain.
Bleaching light: 546 mμ, approximately 5×10 quanta sec⁻¹ cm⁻².

FIG. 2. (a) Formation of the M- and R-bands in an additively
colored, undeformed KCl crystal by irradiation with F-light at
room temperature. (b) Formation of a broad absorption band by
F-irradiation at room temperature 48 hours after plastic deformation.
The crystals were cut from the same colored piece and
subjected to the same conditions of irradiation. The spectra were
measured at liquid N₂ temperature.
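
The calibration procedure of this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' actual data reduction: the function names are hypothetical, the slope values are invented for the example, and the reference trap concentration of 3×10¹⁶ per cm³ is the order of magnitude quoted for the undeformed KCl crystals.

```python
def calibrate_factor(n_traps_ref, initial_slope_ref):
    """Proportionality factor k between trap concentration (cm^-3) and the
    magnitude of the initial bleaching slope, taken from a reference crystal
    (curve I) whose trap concentration is known."""
    if initial_slope_ref <= 0:
        raise ValueError("reference slope magnitude must be positive")
    return n_traps_ref / initial_slope_ref


def traps_from_slope(initial_slope, k):
    """Trap concentration (cm^-3) implied by an initial bleaching slope,
    assuming the same F-center concentration, light intensity and thickness
    as in the calibration run."""
    return k * initial_slope


# Hypothetical numbers: slopes are magnitudes of d(alpha)/dt at t = 0.
k = calibrate_factor(n_traps_ref=3e16, initial_slope_ref=0.002)
n_deformed = traps_from_slope(initial_slope=0.006, k=k)
```

With these invented slopes, a threefold steeper initial slope gives a threefold larger trap concentration; the factor k is only meaningful for crystals measured under the same conditions as the reference.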

In NaCl the circumstances are somewhat more com- 
plicated, because of the formation of metastable 
F’-centers. It is necessary to wait, after irradiation with 
F-light, until the F’-centers have decayed, before the 
decrease of the F-band can be measured. 
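
How long one has to wait can be estimated from the half-life of about 10 minutes quoted for the F’-centers in NaCl at room temperature. The sketch below assumes a simple exponential decay, which the source does not state explicitly; the function names and the 1 percent target are illustrative choices.

```python
import math


def fraction_remaining(t_min, half_life_min=10.0):
    """Fraction of F'-centers left after t_min minutes, assuming
    single-exponential decay with the given half-life."""
    return 0.5 ** (t_min / half_life_min)


def wait_time(target_fraction, half_life_min=10.0):
    """Minutes needed until only target_fraction of the F'-centers remain."""
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must lie strictly between 0 and 1")
    return half_life_min * math.log2(1.0 / target_fraction)


# Illustrative target: wait until 1 percent of the F'-centers are left.
minutes_to_one_percent = wait_time(0.01)
```

Under this assumption, reducing the F’-center population to about one percent takes roughly six to seven half-lives, that is, somewhat over an hour.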


C. MECHANISM OF OPTICAL BLEACHING OF THE 
F-BAND IN ADDITIVELY COLORED CRYSTALS 


The bleaching curve of Fig. 1 suggests that two 
different bleaching mechanisms exist: The fast bleaching
corresponds probably to direct trapping of electrons in 
vacancy clusters already present in the crystal before 
irradiation. The slow bleaching is a more complex 
process. The traps are formed during irradiation, and 
migration processes are involved. 


(1) Bleaching of Undeformed Crystals 


The additively colored KCl crystals used in our 
experiments contained about 3×10¹⁶ traps per cm³.
Presumably most of these originate from the quenching 
process which the crystals have to undergo after being 
heated in the alkali metal vapor. The vacancies originally
present combine to neutral vacancy pairs. These
are very mobile and coagulate, forming quartets. The 
latter do not migrate and are believed to be the final 
clusters. If such a quartet captures an electron, a posi- 
tive ion vacancy is ejected and an M-center is formed. 
This process is predominant in the region of fast bleach- 
ing. The trapping of a second electron in an M-center 
with the formation of R-centers may also occur to a 
certain extent. 

The slow bleaching is probably due to the following 
process: Positive-ion vacancies, which are left from the 
processes described above, or which are due to the 
presence of divalent impurities, migrate and join the 
optically ionized F-centers. New vacancy pairs and 
quartets can be formed this way, and the bleaching 
goes on. However, it is slow, because migration processes 
are involved. Figure 2(a) illustrates the formation of 
the M- and R-bands in an undeformed additively 
colored KCl crystal.


(2) Bleaching of the Deformed Crystals 


The initial bleaching of the deformed crystal is much 
faster, as the number of traps is larger (Fig. 1). How- 
ever, there is not only a difference in the quantity of 
the traps but also in their quality. Moreover, the nature 
of the traps depends upon the time which has passed 
after cold work. 

(a) If an additively colored KCl crystal is irradiated
with F-light 48 hours after plastic deformation, the 
absorption spectrum of Fig. 2(b) results: The M-band 
is not enhanced, and the R-bands are hardly detectable. 

FIG. 3. Formation of a very prominent M-band in an additively
colored KCl crystal by irradiation with F-light during plastic
deformation (curve I). Growth of the M-band and decrease of
the R₂-band after annealing at room temperature (curve II).
Influence of the same irradiation 48 hours after plastic deformation
(curve III). The spectra are measured at liquid N₂ temperature.

⁴ F. Seitz, Revs. Modern Phys. 18, 384 (1946).
⁵ A. B. Scott and L. P. Bupp, Phys. Rev. 79, 341 (1950).

A very broad absorption band is superposed on the 
M-band. At early and intermediate stages of F-irradi- 
ation this broad band grows very fast, whereas the 
M-band remains unchanged. There is no doubt that 
this broad band has to be attributed to electrons cap- 
tured in the traps which have been generated by the 
plastic deformation. Very probably, it may be identified 
with the band which Scott and Bupp⁵ have termed
the R’-band. These workers obtained it by a combined
heat and optical treatment. The nature of the corresponding
color centers is believed to be composite and
is not known.⁶

NaCl shows a slightly different behavior.² The M-
and R-bands are enhanced in addition to a broad band 
extending from the M-band to the F-band. This band 
might be due to colloids of very inhomogeneous size, or 
it has to be identified with the R’-band. 

(b) If the crystal is irradiated with F-light during
cold work, the absorption spectrum is entirely different
from that discussed above. No R’-band is formed. The
M-band is enhanced instead, and the R₂-band is formed.
Curve I in Fig. 3 shows the absorption spectrum of an
additively colored KCl crystal which has been plastically
deformed under strong F-irradiation. A plastic
strain of 20 percent was applied in 4 minutes. Then the
light source was shut off, and the crystal was cooled immediately
to liquid N₂ temperature. For comparison, a piece
cut from the same colored crystal was subjected to
the same plastic strain in the dark. Forty-eight hours
after cold work it was irradiated with the same intensity
and for the same time. Then the spectrum was again
measured at liquid N₂ temperature (curve III, Fig. 3).

These experiments demonstrate clearly that within
the first 4 minutes of cold work a large number of
quartets are present, which capture electrons and thus
are converted into M-centers and R₂-centers.

If the crystal which has been irradiated during plastic
deformation is annealed at room temperature for 48
hours, the R₂-band decreases and the M-band and the
F-band grow (curve II, Fig. 3). Apparently the following
reaction occurs: R₂-center + vacancy pair → M-center
+ F-center.

FIG. 4. Initial optical bleaching rates of the F-band in additively
colored KCl crystals, measured at room temperature, for an
undeformed crystal and for crystals with 2.60%, 4.38%, and 13.5%
plastic strain. Ordinate: absorption coefficient at F-peak;
abscissa: irradiation time (sec). Bleaching
light 546 mμ, approximately 5×10" quanta sec⁻¹ cm⁻².

⁶ F. Seitz, Revs. Modern Phys. 26, 7 (1954).

FIG. 5. Concentration of electron traps in KCl
measured 2 days after cold work.


D. OPTICAL BLEACHING RATES OF THE F BAND 


From the foregoing considerations it is clear that the 
initial bleaching rate of the F-band is a measure of the 
concentration of stable traps (other than negative ion 
vacancies) initially present in the crystal. From the 
fast decrease of the (ionic) conductivity during the first 
few minutes after cold work,¹ one concludes that the
number and the nature of the vacancy aggregates
change rapidly. Bleaching rate measurements take at
least a few minutes. Therefore, they cannot yield con- 
clusive results if carried out immediately after plastic 
flow. All bleaching rates were measured about 2 days 
after cold work. 


Experimental Procedure 


Synthetic KCl crystals (Harshaw) were uniformly 
colored by heating in Na vapor at 530°C for 48 hours. 
The colored samples were plastically deformed in the 
dark by means of an elastic clamp. The rate of com- 
pression was about 10 percent per 20 minutes. The 
surface of the deformed samples was polished with a 
wet cloth. All crystals to be compared were cut from 
the same colored piece, ground to the same thickness 
and subjected to the same surface treatment. The 
bleaching light was the green mercury line λ = 546 mμ.
Approximately 5×10" quanta sec⁻¹ cm⁻² were incident
on the crystal. The intensity of the monochromatic 
light, which served to measure the absorption at F-peak, 
was at least two orders of magnitude smaller, and its 
influence could be neglected. Figure 4 shows a typical 
result of a measurement of initial bleaching rates. 

The relation between the initial slope of the bleaching 
curve and the absolute concentration of traps was determined
by the procedure described in Sec. B. For the
undeformed crystals we obtained trap concentrations
ranging from 2.75×10¹⁶ to 2.85×10¹⁶ cm⁻³. In Fig. 5
the total concentration of traps is plotted versus plastic
strain.

FIG. 6. Concentration of electron traps in NaCl
generated by plastic deformation.

Similar experiments were carried out with NaCl 
crystals. As it is very difficult to produce F-centers in 
NaCl by heating in Na vapor without considerable 
coagulation to colloids, we colored these crystals by 
injection of electrons from a pointed cathode at 530°C. 
The regions of fast and slow bleaching are not so clearly 
separated in undeformed crystals. However, a small 
plastic strain is already sufficient to obtain a distinct 
separation. Therefore, we determined the proportion- 
ality factor between the initial bleaching rate and the 
absolute concentration of traps, using crystals with 2 
percent strain. The concentration of F-centers in the 
different samples did not differ considerably, and the 
same proportionality factor could be used for all 
samples. Figure 6 summarizes the results. 


Discussion of the Results 


There is no significant difference between KCl and 
NaCl with regard to the number of traps generated by 
cold work. Ten percent strain yields 0.7×10¹⁷ traps per
cm³ in KCl and 0.5×10¹⁷ traps per cm³ in NaCl.

FIG. 7. Bleaching of the F-band in additively colored KCl by
plastic deformation at room temperature.

The number of traps is related to the number of
single vacancies originally generated by the moving
dislocations. In order to establish this relation, definite
knowledge of the nature of the traps is necessary. In the 
case of NaCl, the absorption spectrum reveals that
M- and R-centers are formed in addition to R’-centers.
The former two centers are believed to contain two
negative ion vacancies and one or two electrons⁶ and
not more than one positive ion vacancy. The character
of the R’-centers is composite and not known. However,
it is reasonable to assume that it is similar to the
character of the M- and R-centers. Therefore one or
two electrons are trapped per pair of negative ion
vacancies. On the other hand it is likely that equal
numbers of positive and negative ion vacancies are produced.
Therefore the total number of single vacancies
is 2 to 4 times the number of traps. Thus, we may
conclude that 1×10¹⁷ to 3×10¹⁷ single vacancies per
cm³ are produced by a plastic strain of 10 percent.
These numbers are in good agreement with Seitz’s
interpretation of the enhanced conductivity observed
by Gyulai and Hartly.⁷

FIG. 8. Influence of cold work on the thermal conversion of
F’-centers into F-centers in NaCl at room temperature. Growth
of the F-band versus time. The dotted curve corresponds to the
behavior of the undeformed crystal.
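
The arithmetic behind the vacancy estimate in the Discussion can be checked with a short sketch. It takes the trap concentrations at 10 percent strain as 0.7×10¹⁷ per cm³ for KCl and 0.5×10¹⁷ per cm³ for NaCl, and the factor of 2 to 4 single vacancies per trap argued in the text; the variable and function names are hypothetical.

```python
# Trap concentrations after 10 percent strain, as read from the text (cm^-3).
TRAPS_PER_CM3 = {"KCl": 0.7e17, "NaCl": 0.5e17}

# Single vacancies per trap: the text argues for a factor between 2 and 4.
VACANCIES_PER_TRAP = (2, 4)


def vacancy_range(traps_per_cm3, factors=VACANCIES_PER_TRAP):
    """Return (low, high) single-vacancy concentration for one trap concentration."""
    low_factor, high_factor = factors
    return traps_per_cm3 * low_factor, traps_per_cm3 * high_factor


ranges = {salt: vacancy_range(n) for salt, n in TRAPS_PER_CM3.items()}
overall_low = min(low for low, _ in ranges.values())
overall_high = max(high for _, high in ranges.values())
```

On these inputs the overall range runs from 1.0×10¹⁷ to 2.8×10¹⁷ single vacancies per cm³, which rounds to the interval of 1×10¹⁷ to 3×10¹⁷ stated in the text.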


E. BLEACHING OF THE F BAND BY PLASTIC 
FLOW ONLY 


We expect that plastic flow without subsequent
irradiation of F-light has an influence on the absorption 
spectrum of additively or electrolytically colored crys- 
tals. Two essentially different mechanisms may be 
effective: (1) Moving dislocations generate local heat- 
ing. The electron of an F-center can be thermally 
released, when a dislocation passes very close. This 
electron may become captured by one of the traps 
generated by the moving dislocation. (2) Vacancies and 
vacancy pairs, generated by the plastic deformation, 
migrate and join F-centers, transforming these into 
M-centers, R-centers, or more complex aggregates. Both 
mechanisms result in a bleaching of the F-band. Our 
experiments with additively colored KCl crystals show 
that a relatively small bleaching effect exists. In Fig. 7 
the percentage of F-centers destroyed is plotted versus
plastic strain. The curve exhibits a pronounced saturation
behavior above 10 percent strain. In order to
decide which of the two proposed mechanisms is more
effective, the bleaching effect of cold work at liquid N₂
temperature was also investigated. Migration is negligible
in this case. The decrease of the F-band is of
the same order as at room temperature, indicating that
mechanism (1) is at least predominant. It seems that
the vacancies generated by the moving dislocations are
formed in a coagulated manner or coagulate into
immobile clusters before they have a chance to meet an
F-center. This in turn suggests that very high local
concentrations of vacancies are generated.

⁷ Z. Gyulai and D. Hartly, Z. Physik 51, 378 (1928).


F. INFLUENCE OF PLASTIC FLOW ON THE 
F’-BAND 


1. Formation of the F’-Band 


An F’-center consists of an F-center which has 
captured a second electron. It can be formed in addi- 
tively or electrolytically colored crystals by irradiation 
of F-light at appropriate temperatures. 

In a plastically deformed crystal, however, an elec- 
tron which is optically released from an F-center has 
considerable chance of being trapped in a vacancy 
cluster instead of being captured by an F-center. We 
found indeed that no F’-band can be formed in addi- 
tively colored KCl by irradiation of F-light at dry ice 
temperature if the crystal previously had been sub- 
jected to plastic strain. 


2. Bleaching of the F’-Band 


F’-centers are metastable in NaCl at room tempera- 
ture. An electron is thermally released and an F-center 
is left. In an undeformed crystal nearly all the thermally 
released electrons return to a negative ion vacancy and 
two F-centers are formed for every decaying F’-center.¹
This reaction is considerably perturbed if the crystal is 
cold worked. The growth rate of the F-band is smaller 
after plastic deformation (Fig. 8). Two different effects 
contribute to this decrease: 
FIG. 9. Influence of cold work at dry ice temperature on the
absorption spectrum of an electrolytically colored NaCl crystal
containing F’-centers (F’-band saturated).


(a) F’-centers have been destroyed during cold work 
by local heating. Thus the concentration of decaying 
F’-centers has been reduced.

(b) Part of the thermally released electrons are 
captured by vacancy clusters. Therefore less than two 
F-centers are formed for every decaying F’-center. 

Process (a) was investigated separately with the 
crystal held at dry ice temperature when thermal 
freeing of electrons from F’-centers is negligibly slow 
(Fig. 9). The F’-band is partially bleached by cold 
work, and the M-band is enhanced. This confirms the 
conclusions made in Sec. C: The vacancies, originally 
generated by moving dislocations, coagulate to pairs 
and quartets within a few minutes. Process (b) could 
be observed during warming up: The F’-band bleaches 
completely, and the M-band grows (Fig. 9). 

The authors wish to thank Professor R. J. Maurer 
and Professor F. Seitz for advice and stimulating dis- 
cussions. 
Studies are reported on persistent internal polarization effects.
This polarization can be produced, particularly in photocon- 
ductive, fluorescent substances of high dark resistance, by the 
action of various kinds of radiation in the presence of a dc electric 
field. It has been found that such polarizations, of more than 
10 000 volts/cm, will persist for many days if kept in the dark 
after field removal. A material so polarized is in many ways a 
photosensitive electret. Among the substances tested, a (Zn:Cd)S
phosphor and anthracene were used most extensively. Measure- 
ments are reported on the effect of ultraviolet, visible, and infrared 
light, gamma and beta rays on the production and removal of 
polarization. The increase of polarization with time is initially 
rapid and then shows saturation. It is almost a logarithmic func- 
tion of the exciting radiation intensity, and a linear function of 
the polarizing voltage over a wide range. It is found that the 
equilibrium values of polarization are essentially determined by
the applied voltage, and one parameter specific to the substance.
Data are given on the long-time storage of polarization. It is 
shown that this polarization is due to a partial separation by the 
applied field of free mobile charges produced by the radiation 
inside the material and their localization in traps. It is not a 
charge injection-ejection phenomenon at the electrodes since it 
occurs just as well in samples insulated from the electrodes. In 
powders at least, the polarization is distributed throughout the 
entire sample and is not a charge accumulation near the surfaces. 
A correlation is established between these phenomena and the 
mechanism assumed for these substances to explain photocon- 
ductivity and fluorescence. A phenomenological model is presented 
which quantitatively describes many of the results of these 
experiments. This polarization effect provides a new method to 
detect and study energy storage in crystals. 





I. INTRODUCTION 


Since the earliest days of research on photoconductivity
it has been recognized that a space charge
type of polarization occurs in insulating photoconductive
substances and persists after field removal.¹
Generally it was believed to arise out of the immobility 
of the positive charges and the trapping of electrons 
moving in the conduction band, or their actual removal 
from the crystal by the field without a sufficient supply 
of new charges from the electrodes. The classical method
of detecting and measuring polarization was to remove 
the applied field and flood the crystal with red or infra- 
red light. If the crystal had been polarized, a reverse 
current (to the normal photocurrent) would flow.¹ The
magnitude of this reverse current was taken as a meas- 
ure of the extent of polarization in the sample. 

The effect of primary interest in those investigations 
was photoconductivity. It was natural, therefore, that 
polarization and the change in effective field strength
due to it were considered an inherent interfering effect.
Much of the current work² on this subject is pointed
towards accurately evaluating the influence of polarization
on photocurrents in order to subtract the change it
introduced in the measurements³; or to perform the
experiments in such a manner that the deviations due 
to polarization were minimized, for instance by flooding 
the sample with red or infrared light concomitant with 
the exciting radiation, or to make the measurements of 


* This work was supported by the Signal Corps Laboratories, 
Fort Monmouth, New Jersey. 

† Part of a dissertation presented to the Physics Department,
New York University, in partial fulfillment of the requirements
for the Ph.D. degree.

¹ E.g., R. Hilsch and R. W. Pohl, Z. Physik 87, 78 (1933). (For
?-_ experiments see references in 10.)

² E.g., A. Rose, R.C.A. Rev. 12, 362 (1951).
³ R. C. Herman and R. Hofstadter, Phys. Rev. 59, 79 (1941).
⁴ J. J. Dropkin (unpublished thesis, June, 1947).


photocurrents using either short-time flash illumination⁵
or short-time, low-intensity radiation.⁶

Experiments of more recent vintage in the field of 
crystal conduction counters have encountered polariza- 
tion in the form of a progressive deterioration of pulse 
heights with increasing number of counts. This was 
ascribed to the decrease in effective field strength as 
the polarization builds up.⁷

Three major methods have been employed to deter- 
mine polarization: the measurements of a reversed 
current with various modifications, the use of a poten- 
tial probe moved over the lateral surfaces of the 
crystals,⁸,⁹ and the observation of the movement of
color center clouds.¹,¹⁰

This paper presents an investigation of the internal 
polarization effect itself, especially of that portion 
which persists for long periods of time, without any 
view to its special impeding effects on current flow; 
indeed, it will be shown that the formation of the
polarization is largely independent of the flow of an
actual body photocurrent and can be performed to the
same extent with the sample between insulating sheets
of mica, which, while displaying the well-known dielectric
charging, block a steady state photocurrent.
It deals with the formation and removal of polarization 
by various kinds of radiation; with its persistence, and 
its natural decay. The main result is that this persistent 
internal polarization is a much more important and 
outstanding effect than anticipated up to now. Some 
materials are actually equivalent to the well-known 


⁵ N. F. Mott, Proc. Phys. Soc. (London) 50, 200 (1938).
⁶ M. F. Manning and M. E. Bell, Revs. Modern Phys. 12, 215
(1940), p. 226.
⁷ R. Hofstadter, Nucleonics 4, 11 (1949); H. Kallmann and
R. Warminsky, Ann. Physik 6, 4 4’ (1948).
⁸ R. H. Bube, Phys. Rev. 83, 393 (1951).
⁹ R. W. Smith, R.C.A. Rev. 12, 352 (1951).
¹⁰ A. L. Hughes, Revs. Modern Phys. 8, 300 (1936).

electrets¹¹ and display polarization charges up to 10-°
coulombs/cm². The best of these materials were a
(Zn:Cd)S phosphor (henceforth referred to as K) and 
anthracene. They are both capable of sustaining polar- 
ization field gradients of more than 10 000 volts/cm for 
long periods of time (see Sec. VII) and furnish output 
potentials far above 100 volts. The creation and dis- 
charge of these polarizations are highly radiation- 
sensitive. 
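
As a rough consistency check on these figures, the potential across a sample follows from field times thickness. The thickness $d = 0.1$ mm is the sample thickness quoted in Sec. II; treating the polarization field as uniform across the sample is an assumption made here for illustration.

```latex
V = E\,d \gtrsim \left(10^{4}\ \mathrm{V/cm}\right)\left(10^{-2}\ \mathrm{cm}\right) = 100\ \mathrm{V}
```

On this estimate a field gradient of more than 10 000 volts/cm corresponds to a potential of more than 100 volts, in line with the output potentials quoted.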

The measurements reported are preliminary and the 
work of collecting new data is still in progress. They 
already indicate that much insight into the nature of 
the electrical, optical, and storage properties of photo- 
sensitive insulators can be gained from such an ap- 
proach. 


II. EXPERIMENTAL METHODS AND APPARATUS 


Two methods have been used to measure the per- 
sistent internal polarization in a sample: the radiation 
discharge method, which is a nonrepeatable procedure, 
destroying the polarization at least partially, and the 
lifted electrode method which permits repeated meas- 
urements without appreciably disturbing the polar- 
ization. 

After a thin (0.1 mm thick) sample of the material 
to be investigated has been polarized (as indicated in 
Fig. 1) and both its surfaces grounded thereafter, an 
internal potential in the sample exists which is balanced 
by a voltage drop of opposite direction between the 
polarization charges and the image charges at the 
surfaces of the electrodes; the total voltage drop across 
the electrodes is zero. If one electrode is then attached 
to the grid of an electrometer tube, the ground con- 
nection removed, and the sample irradiated with various 
kinds of radiation, a decrease of the polarization occurs 
and a portion of the image charges will be freed. Thus a 
voltage increase V (termed the radiation discharge 
voltage) is produced at the grid of the electrometer, 
and from V the total polarization stored in the powder 
can be computed (see Sec. VIII). Thus by measuring V, 
the build up, the decay and the discharge of polarization 
can be determined. Since the radiation discharge
method, however, destroys the polarization of the
sample, the build up and particularly the decay of
polarization can be determined by this method only if, after
each V reading, the sample is thoroughly depolarized
and then polarized again in the same manner as before.
The results of such measurements are reproducible to 
within less than 5 percent. 

The second method, derived from electret work, uses 
a brass electrode (1 cm² area) on the sample surface
which can be lifted away from the surface after the grid 
is connected and the ground removed. The image 
charges (Q) induced on the electrode by the polarization 
across the sample will all appear across the grid capacitance
($C_g$) if the electrode is lifted ($C_s$, the capacitance
of the unpolarized sample, goes to zero), and will
produce a voltage ($V_g$) on the grid. Then we have the
relation $Q = C_g V_g$, or

$V_p = (C_g/C_s)\,V_g$, (a)

where $V_p$ is an equivalent potential which would appear
across the sample if only the sample itself and the
charges Q were involved. We define $V_p$ by Eq. (a)
since the charge Q appears only across $C_s$ and not also
across $C_g$ when the electrode is in contact with the
sample.

¹¹ G. Nadjakoff [Compt. rend. 204, 1865 (1937)] describes a
photoelectret effect in sulfur, similar to the phenomena described
in this paper.

FIG. 1. Electrode and radiation arrangement of sample, showing
direction of applied and polarization fields.
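
The lifted electrode relation of Eq. (a) can be illustrated numerically. The sketch below is a hedged example only: the names `c_grid`, `c_sample`, and `v_grid` stand for the grid capacitance, the capacitance of the unpolarized sample, and the voltage read on the grid after lifting, and the capacitance and voltage values are invented, not taken from the paper.

```python
def image_charge(v_grid, c_grid):
    """Image charge Q (coulombs) that appears across the grid capacitance
    once the electrode is lifted: Q = C_grid * V_grid."""
    return c_grid * v_grid


def equivalent_sample_potential(v_grid, c_grid, c_sample):
    """Equivalent potential across the sample, Eq. (a):
    V = (C_grid / C_sample) * V_grid, i.e. Q / C_sample."""
    if c_sample <= 0:
        raise ValueError("sample capacitance must be positive")
    return (c_grid / c_sample) * v_grid


# Hypothetical values: 20 pF grid capacitance, 40 pF sample capacitance,
# 5 V read on the electrometer grid after lifting the electrode.
q = image_charge(v_grid=5.0, c_grid=20e-12)
v_equiv = equivalent_sample_potential(v_grid=5.0, c_grid=20e-12, c_sample=40e-12)
```

The second function is just Q divided by the sample capacitance, which is why the equivalent potential differs from the grid reading by the capacitance ratio.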

Extensive measurements have shown that the field 
changes during the lifting process do not appreciably
affect the polarizations investigated, and that the re- 
sults obtained with both methods are equivalent. 

For substances in the form of fine powder the samples 
were prepared in the manner previously described¹² but
with a larger amount of Duco cement. When a sufficient
thickness (5 to 50 mg/cm²) was deposited, it was
dried by an infrared baking lamp and was, when cooled, 
uniform and compact. 

In the case of anthracene (scintillation-grade flakes 
obtained from the Reilly Tar Company), fused solid 
samples and single crystals of thicknesses between 0.05 
and 0.3 mm were tested in addition to powdered 
samples. 

The samples were placed between two electrodes; 
one was the conducting coat of a Nesa¹³ glass plate and
the second was a small square of flexible aluminum leaf 
in the center of the surface of the substance. A small 
brass weight was placed over this. When a voltage was 
applied across the sample, the aluminum leaf was 
drawn tightly down making intimate contact with the 
surface. Normally a dc field (termed polarizing voltage) 
is applied with concomitant radiation to polarize the 
sample. After a given period of time the radiation is 
removed, and then the field is turned off. The two elec- 
trodes are put to ground for various lengths of time and 


¹² See H. Kallmann and B. Kramer, Phys. Rev. 87, 91 (1952),
p. 93.

¹³ Commercial form of glass plate with evaporated transparent
conductive coating obtained from Pittsburgh Plate Glass Company.
TABLE I. Radiation sources.

Radiation region     Source                   Wavelengths (or energy)   Intensity (a)
------------------   ----------------------   -----------------------   ----------------------
Ultraviolet light    Mercury discharge bulb   3660 A                    15 microwatts/cm² (b)
Visible light        Tungsten                 4000-7000 A               ca 1 milliwatt/cm² (b)
Infrared light       250-watt heating lamp    0.8 to 2 microns          15 milliwatts/cm² (b)
Mercury flash tube   FT-214 tube              uv-visible-infrared       5 × 10³ ergs/cm² (b)
Gamma rays           Radium                                             1 millicurie
Beta rays            Strontium 90                                       10 millicuries

(a) Geometry factor different for each source.
(b) At the sample surface.


then a polarization measurement is performed as de- 
scribed above. The polarizing process can also be per- 
formed between insulating sheets; the determining 
factor is the external field across the sample. In this 
case after polarizing, for the purpose of measurement, 
one or both insulating sheets are removed when the 
lifted electrode method is used. 

Exciting or releasing radiations in the form of ultra- 
violet (3660 A), visible and infrared (up to 2 microns) 
light were incident on the sample through the trans- 
parent glass side and high energy radiations (gamma 
and beta rays) impinged on the sample through the 
thin aluminum foil as shown in Fig. 1. The various 
radiation sources are tabulated in Table I. 

The electrometer system used to measure polarization 
voltages is shown in Fig. 2. It should be noted that no 
external grid resistor is used. The cathode is biased 
positive with respect to ground so that the grid floats 
at ground with a minimum grid current being drawn 
(ca 10-“ amp). To this system, one of the electrodes 
(usually the metal electrode) was connected, whereas 
the other electrode (Nesa glass) was kept at ground.
Since the maximum polarization voltages to be meas- 
ured are well above 100 volts and since the cut-off 
voltage of the tube is about 6 volts, it was necessary to 
introduce two variable air condensers as voltage dividers.
The entire grid input assembly was enveloped in
a grounded electrostatic shield. All parts of the apparatus
exposed to higher voltages were made of polyethylene.
Therefore, it was the Nesa glass that was
usually kept at ground to avoid leakage across the glass
sides. Using usual precautions the apparatus was free
of drift, kept a negative applied voltage long enough 
for accurate measurements, was easily calibrated and 
held its calibration for many months of operation. The 
minimum detectable voltage is 10 millivolts; the maxi- 
mum voltage capable of being registered is 150 volts. 
The substances tested up to now are described in 
Table II. No polarization means that the persistent 
effect was not detectable with described methods. 
These scanty results indicate that the substances show 
either both noticeable photoconduction and polarization 
or neither. This means that the production of mobile 
charge by exciting radiation is necessary for polariza- 
tion; but obviously, this is not a sufficient condition. 


This can be seen from the results with pure ZnS and 
CdS. Both show little photoconductivity, but one 
knows from conduction counting experiments with pure 
CdS that various kinds of radiations produce mobile 
charges in unactivated CdS.¹⁴ The small photoconductivity
in these substances is attributed to the lack of
localized charge (and consequently a small carrier
lifetime); i.e., they contain no activators to make the
positive charge remain fixed in the substance. There- 
fore no photoconductivity and also no persistent polar- 
ization can develop. For high photoconductivity at 
least a partial localization of one kind of charge is 
necessary, but it is not enough to bring about per- 
sistent polarization. For this a partial localization or 
trapping of both charges is imperative since otherwise 
the kind of charge which is not trapped at all could 
always recombine with the localized charge when the 
external field is removed. From this viewpoint it would 
be interesting to see whether substances with long time 
storage properties display an exceptionally high polar- 
ization. Our investigations already show that the 
reverse does not hold. The phosphorescence of anthra- 
cene, which was specially tested, is extremely poor 
compared to that of the (Zn:Cd)S phosphors, and in 
spite of this it displays a very strong persistent polar- 
ization. The reason for this lack of correlation probably 
originates in the small number of displaced charges 
required to bring about a very appreciable polarization. 
Not more than 10” (+) and (—) trapped charges per 
cm? in a layer of yo-mm thickness are sufficient to 
produce a polarization voltage of 100 volts. 

It appears obvious that in order for polarization to 
persist, the number of mobile charges after the removal 
of the exciting source must decay fast to a low value. 
Thus in all substances without a very high dark re- 
FIG. 2. Schematic diagram of the measuring and
recording system.


14 H. Kallmann and R. Warminsky, Ann. Physik 6, 4 (1948).


sistance, polarization may occur under an applied field, 
but will not persist after field removal. The high dark 
resistance of the (Zn: Cd)S phosphors and of the organic 
phosphors is thus essential for the long persistent 
polarization. 

It may be that some of the substances showing no
polarization under ultraviolet excitation can be polarized
with higher excitation energies. This is, however,
doubtful since anthracene, with a long wave absorption
cut off at 3900 A, can still be polarized with light
beyond 5000 A. The polarization, its buildup, decay,
removal, and persistence were studied in more detail
primarily in the substances K and anthracene.


III. POLARIZATION AS FUNCTION OF 
EXPERIMENTAL VARIABLES 


A. Polarization as a Function of Polarizing Voltage 
and Polarizing Time 


Most of the evidence accumulated to the present 
indicates that the internal polarization in a sample 
increases fairly linearly with the applied polarizing 
voltage; but such a dependence is only sensible when
the dependence of the polarization on the polarizing
time is taken into account. This buildup of polarization
with time is characterized by an initial rapid increase
followed by a gradual slowing down of the rate of
increase until an equilibrium value is reached and no
further increase occurs. This buildup is of the form
$P = P_{\max}[1-\exp(-t/\tau)]$. This slowing down of the rate
of increase probably results from the fact that as the 
polarization builds up, the effective polarizing field on 
mobile charges decreases. (See Sec. VIII.) This type of 
rise of polarization is illustrated in Figs. 3 and 4 for 
anthracene and K respectively. These data were taken 
using a fairly high intensity light source of excitation 
(ca 15 microwatts/cm²). If the exciting intensity is
reduced, the time to reach the maximum polarization 
increases. A series of measurements on anthracene shows 
this quite clearly; the applied field was kept constant, 


TABLE II. Materials tested for persistent internal
polarization (P.I.P.).

                                                      Exhibits     Exhibits pronounced   Exhibits pronounced
No.   Substance              Physical state           pronounced   photoconductivity     fluorescence
                                                      P.I.P.
---   --------------------   ----------------------   ----------   -------------------   -------------------
 1.   K (Zn:CdS)             Powder
 2.   M (ZnS)                Powder
 3.   N (ZnS)                Powder
 4.   L (Zn:CdS)             Powder
 5.   LG2150 (ZnS)           Powder
 6.   Anthracene             Powder
 7.   Anthracene             Fused polycrystal
 8.   Anthracene             Single small crystal
 9.   Fluoranthene           Fused polycrystal
10.   Chrysene               Powder
11.   9-Bromoanthracene      Powder
12.   Trans-stilbene         Single large crystal
13.   ZnS, nonactivated      Powder
14.   CdS, nonactivated      Powder
15.   Paraffin               Fused sample (thin)
FIG. 3. Lifted electrode and radiation discharge voltages in
anthracene as function of the polarizing time (polarizing voltage
= 20 volts; ultraviolet intensity = 15 microwatts/cm²).


FIG. 4. Lifted electrode and radiation discharge voltages in K
as function of the polarizing time: (a) radiation discharge voltage;
(b) lifted electrode voltage (polarizing voltage = 100 volts; ultraviolet
intensity = 15 microwatts/cm²).


while for several intensities of exciting radiation, the 
polarization build up versus time was determined. The 
results are plotted in Fig. 5. For very low intensities 
the slope is small and almost linear in time. As the 
intensities increase, the initial slope increases (though
at a much slower rate) and the linear portion of the
curve is limited to smaller periods of time. It may be
assumed that if the polarizing time were increased
indefinitely for each curve, the final equilibrium values of
the polarization would be identical regardless of the 
intensity of excitation. 
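
The buildup law P = Pmax[1 - exp(-t/τ)] quoted above can be made concrete with a short Python sketch. It is only an illustration: the values of Pmax and τ are hypothetical, and the assumption that a lower exciting intensity corresponds simply to a larger τ with the same Pmax is a simplification of the qualitative behavior described in the text.

```python
import math


def polarization(t, p_max, tau):
    """Buildup law P = P_max * [1 - exp(-t / tau)]."""
    return p_max * (1.0 - math.exp(-t / tau))


def time_to_fraction(fraction, tau):
    """Polarizing time needed to reach a given fraction (0 < fraction < 1) of P_max."""
    return -tau * math.log(1.0 - fraction)


# Hypothetical parameters: the same equilibrium value, two time constants (minutes).
p_max = 100.0
for tau in (1.0, 5.0):
    t90 = time_to_fraction(0.9, tau)
    early = polarization(0.1 * tau, p_max, tau)  # nearly linear region, t << tau
    print(f"tau={tau}: 90% of P_max after {t90:.2f} min, P(0.1*tau)={early:.1f}")
```

For t much smaller than τ the expression reduces to P ≈ Pmax t/τ, which corresponds to the nearly linear rise seen at very low intensities, while every curve approaches the same Pmax at long times.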

Further, experiments with very short excitation times 
were carried out with both K and anthracene. For 
anthracene, 300 volts was applied for 15 seconds. During 
this time a light flash from a FT-214 mercury flash 
tube was incident on the sample which was otherwise 
kept in complete darkness. The flash time is of the order 
of 1/2000 second and the energy incident on the sample 
was 5.3×10³ ergs/cm². The voltage $V_e$ amounted to
slightly more than 70 volts. This is smaller than that 
obtained if the same total amount of energy would be 
applied during a longer period. With a quenched sample 
of K and an applied field of 200 volts the same experi- 
ment also produced a polarization larger than 70 volts. 
This shows that even with short excitation periods 
rather large polarization voltages can be produced. 
Anthracene was also polarized with an exposure to 15 
FIG. 5. Polarization in anthracene as function of polarizing
time for various ultraviolet intensities: (a) 2.2×10^-? watt/cm²;
(b) 4×10^? watt/cm²; (c) 7.3×10^-? watt/cm²; (d) 1.3×10^-?
watt/cm²; (e) 1.4×10^-? watt/cm² (polarizing voltage ≈ 300 volts).
Axes: lifted electrode voltage ($V_e$) versus polarizing time (minutes).


microwatts/cm² of ultraviolet radiation for 1/25 second.
A voltage of 300 volts was again applied for 15 seconds.
This yielded a voltage $V_e$ of 37 volts. The energy absorbed
by the 1/25-second exposure is less than the
energy absorbed by the 1/2000-second flash by a factor
of about 1000, and yet the polarizations produced 
differ by a factor of only 2 or 3. This is roughly in 
agreement with the curves of Fig. 5 if the much smaller 
absorption of the energy emitted by the flash bulb than 
that of the ultraviolet radiation is taken into account. 
It should also be taken into account that the polariza- 


FIG. 6. Maximum polarization voltages discharged by a single
light irradiation and subsequent repetitions, in anthracene as
function of the polarizing voltage. (Polarizing time = 1 minute;
ultraviolet intensity = 15 microwatts/cm².)
Axes: radiation discharge voltage ($V$) versus polarizing voltage.


tion obtained depends not only on the time of irradiation 
but also on the period of time the field is applied ; for the 
same periods of irradiation the polarization would be 
larger when the field is applied for longer periods due to 
retrapping. The relationship between the intensity of 
the exciting radiation and polarization will be more 
fully discussed in the next part. 
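
The energy comparison between the 1/25-second ultraviolet exposure and the flash can be checked with a few lines of Python. This is a sketch under stated assumptions: the flash energy is taken as 5.3×10³ ergs/cm², where the exponent is an assumption chosen because it reproduces the stated factor of about 1000, and the voltages 70 and 37 volts are the two values quoted in the text.

```python
MICROWATT_TO_ERG_PER_SEC = 10.0  # 1 microwatt = 1e-6 J/s = 10 erg/s


def exposure_energy(intensity_uw_per_cm2, seconds):
    """Incident energy per unit area (ergs/cm^2) for a steady exposure."""
    return intensity_uw_per_cm2 * MICROWATT_TO_ERG_PER_SEC * seconds


uv_energy = exposure_energy(15.0, 1.0 / 25.0)  # 15 microwatts/cm^2 for 1/25 second
flash_energy = 5.3e3  # ergs/cm^2; exponent assumed, see the paragraph above

energy_ratio = flash_energy / uv_energy
voltage_ratio = 70.0 / 37.0

print(f"ultraviolet exposure: {uv_energy:.1f} ergs/cm^2")
print(f"energy ratio (flash / ultraviolet): {energy_ratio:.0f}")
print(f"polarization ratio: {voltage_ratio:.1f}")
```

An energy ratio near 10³ against a polarization ratio near 2 is the quantitative content of the statement that the polarization depends only weakly on the energy supplied.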

To determine the dependence of the polarization on 
the polarizing voltage, all measurements should be 
made for equilibrium polarization. But it was observed 
that the results are not too different if instead one com- 
pares polarization values taken after a definite shorter 
polarizing time. 

Figures 6 and 7 present $V$, Figs. 8 and 9 $V_p$, and
Fig. 10 $V_e$ versus polarizing voltage. Figure 10 departs
from the usual linearity and shows a parabolic shape.
This special shape may be due to the much lower
intensity of ultraviolet used in this case. Figure 9 shows
that at low polarizing voltages $V_p$ equals or even exceeds
the polarizing voltage.

The results should not be intercompared since they
differ by different ratios of $C_e/C_p$ for $V_p$ in different


FIG. 7. Maximum polarization voltages discharged by a single
light irradiation and subsequent repetitions, in K as function of
the polarizing voltage. (Polarizing time = 1 minute; ultraviolet
intensity = 15 microwatts/cm².)
Axes: radiation discharge voltage ($V$) versus polarizing voltage.


samples. Generally $V_e$ and $V_p$ are of the same order of
magnitude. It should be further mentioned that the
absolute polarization values may also depend upon the
thickness of the samples. The voltages $V$ are usually 2
or 3 times smaller than $V_p$. It is to be expected that at
very high voltages the polarization increase will be 
slowed down because of the saturation of the number of 
traps available for polarization. This may be true for 
anthracene, but for inorganic substances a simple calcu- 
lation shows that saturation should not be expected at 
the polarizing voltages used in these experiments since 
the number of traps is of the order of 10'* per unit 
volume which is much more than the number of charges 
required for polarizing. 


B. Polarization as Function of the Intensity of 
Various Exciting Radiations 


The voltage, $V_e$, in anthracene as a function of the
exciting intensity is given in Fig. 11 for a range of
polarizing times and for excitation with 3660 A, which
is strongly absorbed by the bulk material. The following
major results were obtained: (1) polarization of the
order of volts can be created with exceedingly low in- 
tensities of ultraviolet radiation (10-" watt/cm? and 
less); (2) the polarization is a very slow (almost loga- 
rithmic) function of the exciting intensity; (3) a reci- 
procity law does not hold for this case, which implies 
that the absorbed energy is not equally well utilized for 
producing polarization at different intensities. 

Light of wavelengths longer than the absorption edge 
of the substances (3900 A in anthracene and 4200 A 
in K) is also capable of producing polarization. The 
excitation by visible light may be at first sight astonish- 
ing for anthracene since no excitation to fluorescence 
has been reported in this instance. However, it was 
found that visible light is also capable of exciting phos- 
phorescence and even a small photoconductivity in 
FIG. 8. Lifted electrode voltage in K as function of the polarizing
voltage (polarizing time = 5 minutes; ultraviolet intensity = microwatts/cm²).
Axes: lifted electrode voltage ($V_p$) versus polarizing voltage.


anthracene. For K the polarizing effect of visible light 
is probably due to the absorption of this light by the 
activator atoms and a production of mobile electrons. 
In both substances the amount of visible light necessary 
to produce polarizations comparable to ultraviolet 
polarization is much larger than that of the ultraviolet 
because of the much smaller absorption of this light. 
But these small amounts of light absorbed are sufficient 
to produce polarization. Measurements were further 
performed to determine up to which wavelengths polar- 
ization can still be produced. For this purpose a sample 
of K was first thoroughly de-excited with infrared light 
for 20 minutes to avoid any noticeable dark polarization 
and then polarized under visible light using various 
filters to cut off the radiation at progressively longer 
wavelengths with each succeeding measurement. The 
results are given in Table III. With increasing wave- 
lengths the polarization decreases, but is still consider- 


FIG. 9. Lifted electrode voltage in anthracene as function of the
polarizing voltage (polarizing time = 1 minute; ultraviolet
intensity = 15 microwatts/cm²).
Axes: lifted electrode voltage ($V_p$) versus polarizing voltage.


able at 6100 A and vanishes only beyond 7000 A. Since 
K has a fluorescent spectrum which reaches beyond 
6000 A, this supports the idea that this polarization is 
due to an absorption in activator levels. It will be 
shown in Sec. VI that visible light is also effective in 
discharging the polarization with an efficiency of about 
10 percent of that of ultraviolet light due to the smaller 
absorption coefficient for visible light. 
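
The wavelength dependence reported in Table III can be summarized numerically. The Python sketch below uses the three filter entries and the dark polarization value from that table; normalizing to the 4900 A entry and subtracting the dark polarization as a baseline are illustrative choices made here, not steps taken in the original analysis.

```python
# Table III values (arbitrary units), keyed by short-wavelength cutoff in angstroms.
polarization_by_cutoff = {4900: 33.0, 5350: 31.8, 6100: 22.0}
dark_polarization = 2.6


def relative_to_reference(values, reference_key):
    """Express every entry as a fraction of the entry at reference_key."""
    reference = values[reference_key]
    return {key: value / reference for key, value in values.items()}


relative = relative_to_reference(polarization_by_cutoff, 4900)
for cutoff in sorted(polarization_by_cutoff):
    net = polarization_by_cutoff[cutoff] - dark_polarization
    print(cutoff, round(relative[cutoff], 2), round(net, 1))
```

With the cutoff moved from 4900 A to 6100 A about two thirds of the polarization remains, which is the sense in which the polarization is still considerable at 6100 A.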

Polarization can also be produced in both substances 
using high energy radiation, e.g., 1-Mev electrons as an 
excitation source. The irradiation was equivalent to 
about 1 r/sec or 2 erg/sec energy absorbed in the 
sample. With a polarizing voltage of 200 volts applied 
to K for 10 and then for 20 minutes, V, voltages of 64.1 
and 93.5 volts respectively were obtained. Under ultra- 





FIG. 10. Lifted electrode voltage in anthracene as function of
polarizing voltage for low ultraviolet exciting intensity (polarizing
time = 5 minutes; ultraviolet intensity = 4×10^-? watt/cm²).
Axes: lifted electrode voltage ($V_e$) versus polarizing voltage.
FIG. 11. Lifted electrode voltage in polycrystalline anthracene
as a function of the exciting ultraviolet intensity for different
polarizing periods. (Polarizing voltage ca 300 volts.)
Axes: ultraviolet intensity (watts/cm²) and lifted electrode voltage ($V_e$).





violet irradiation of 150 ergs/sec absorbed, polarizations 
slightly over 100 volts in a 5-minute polarizing time 
resulted. 

With K, the increase of polarization by fast electrons 
with polarizing time was measured by both methods. 
In Table IV the results are given for polarizing periods 
of 1, 5, and 10 minutes. Column 2 of the table shows a 
larger increase in polarization for the 5 to 10 minute 
period than the 1 to 5 minute period. This is unusual 
and has not recurred. The meaning of the fourth column 
will be explained in Sec. VIII; the results indicate that 
a polarizing time of 10 minutes is sufficient to produce 
an equilibrium polarization value with this exciting 
intensity. Fast electrons are also very effective in dis- 
charging polarization, as is indicated in Table V. 
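
The statements about Table IV can be verified directly from its entries. The Python sketch below assumes the column assignment read from the table (lifted electrode voltage 41.2, 54.3, 82.0 volts and radiation discharge voltage 21.8, 29.3, 43.1 volts at 1, 5, and 10 minutes); the helper names are hypothetical.

```python
# Values transcribed from Table IV (powder K, strontium 90 source, 200 volts).
times_min = [1.0, 5.0, 10.0]
v_p = [41.2, 54.3, 82.0]  # lifted electrode voltage (volts)
v = [21.8, 29.3, 43.1]    # radiation discharge voltage (volts)


def increments(values):
    """Differences between successive entries."""
    return [b - a for a, b in zip(values, values[1:])]


def ratios(numerators, denominators):
    """Element-wise ratio of two equally long lists."""
    return [n / d for n, d in zip(numerators, denominators)]


print([round(x, 1) for x in increments(v_p)])  # [13.1, 27.7]
print([round(x, 2) for x in ratios(v_p, v)])   # [1.89, 1.85, 1.9]
```

The second increment exceeds the first, as noted for the 5 to 10 minute period, and the lifted electrode voltage is close to twice the radiation discharge voltage at all three polarizing times.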

Since both substances show large polarizations under 
visible light excitation and both are highly fluorescent 
(blue in anthracene; yellow in K), it seemed possible 
that this fluorescent light is to some extent reabsorbed 
and contributes considerably to the observed polariza- 
tion. The experiments have verified this idea. Two fused 
samples of anthracene were used. One was placed 


TABLE III. Visible light polarization in quenched sample K.
Source of light: tungsten bulb. Polarizing time = 5.0 minutes.
Polarizing voltage = 50.0 volts.

Filter              Cutoff wavelength
(Corning glass)     (short wavelength end)    Polarization ($V_e$)
-----------------   ----------------------    --------------------
3-70                4900 A                    33.0 units
3-67                5350 A                    31.8 units
2-60                6100 A                    22.0 units
Dark polarization                             2.6 units








directly above the ultraviolet lamp. Between it and
the second sample a Corning glass 3-73 filter was interposed
which cut out any ultraviolet not absorbed by
the first sample and transmitted only the fluorescent light
excited in the first sample. A polarizing voltage was
applied to the second sample. Figure 12 is a curve of $V_p$
as a function of the polarizing time. The polarization
voltages obtained in this manner are a large portion of
those produced by direct ultraviolet irradiation. Experiments
of the same type were performed on quenched
samples of K. The effect is exactly the same; irradiating
the sample with the fluorescent light from a second
sample produced a $V_p$ voltage well above 100 volts after
3 minutes polarizing time with a polarizing voltage of
200 volts, again a considerable portion of the polarization
obtained by direct ultraviolet irradiation. The intensity
of the exciting ultraviolet source was 15 microwatts/cm²
at the site of the first sample.

These experiments indicate further that the
slow increase of polarization observed in a thick sample
with polarizing time is due to the absorption in the rear 
portions of the inhomogeneously excited sample of the 
fluorescent light emitted from the excited portion. This 
was tested in the following manner: two quenched 


TABLE IV. Beta-ray polarization vs time of polarization. Powder K.
Strontium 90 source. Polarizing voltage = 200 volts.

Time of           Lifted electrode    Radiation discharge
polarization      voltage ($V_p$)     voltage ($V$)          Ratio
(minutes)         (volts)             (volts)                $V_{\mathrm{cal}}/V$
------------      ----------------    -------------------    --------------------
 1.0              41.2                21.8
 5.0              54.3                29.3
10.0              82.0                43.1








samples of powder K, one of the thickness 0.44 milli- 
meter and the other of 0.055 millimeter were placed 
face to face and pressed together and irradiated with 
ultraviolet light through the thicker sample, which 
completely absorbed the ultraviolet light. The two 
samples were polarized for 10 minutes and then sepa- 
rated after removal of the light and field, and grounding, 
and the polarization was measured in the thick sample. 
The measurement was repeated with the ultraviolet 
light incident through the thin sample. The polarization 
in the thick sample was again determined. The ratio 
of the polarization voltages in the thick sample for the 
irradiation through the two different sides was 1.1. 
Both measurements were repeated with a 1 minute 
polarizing period. The corresponding ratio in this case 
was 5.4. The higher polarization in both cases occurred 
with the ultraviolet incident through the thick sample. 

This experiment shows that for short polarizing 
periods, the polarization is mainly restricted to the 
region of direct excitation. For longer polarizing periods, 
the increase in polarization is mostly due to those 
regions only indirectly excited. It may be pointed out, 
however, that even with uniformly excited powders the 
equilibrium value of the polarization is only gradually
approached even under strong, homogeneously ab- 
sorbed, radiation in both substances; this is due to the 
gradual decrease of the effective polarizing field. 


IV. NATURE AND DISTRIBUTION OF POLARIZATION 


Up to now no complete concept has been presented 
as to how polarization is established. The most obvious 
process for producing polarization is a displacement of 
the free electrons in the conduction band, while the 
positive charge is localized in activators. This inhomo- 
geneous charge distribution of free electrons, which 
may be subsequently trapped, also produces an inhomo- 
geneous distribution of the localized positive charge 
because of the equilibrium established between positive 
and negative charges as a consequence of recombina- 
tions. Thus the final persistent polarization consists of 
electrons in traps, eventually inhomogeneously dis- 
tributed, and of an inhomogeneous positive charge 
distribution. Thus even when the traps are filled almost 
to saturation (homogeneously) polarization occurs as 
a consequence of the inhomogeneous positive charge 


TABLE V. Preliminary measurement of sensitivity of discharge
of polarization in K with various radiations.

                                                     Radiation released voltage
                                                     per unit radiation
Radiation                                            (volts/sec)/(ergs/cm²)
--------------------------------------------------   --------------------------
Ultraviolet (3660 A)                                 0.6
Visible light (4000 A to 7000 A)                     0.06
Infrared light (0.8 to 1.5 microns)                  10⁻⁵
Fast electrons (10-millicurie strontium 90 source)   ca 1
Gamma rays (1-millicurie radium source)              ca 1








distribution. The persistence of the polarization will, in 
any case, depend on the persistence of the electrons in 
the traps. That an inhomogeneous distribution of electrons
in traps is not imperative for polarization is
evidenced by the following experiments: a quenched 
sample of K was polarized with field and concomitant 
ultraviolet radiation. It was then thoroughly quenched 
so that no dark polarization occurred and irradiated 
with ultraviolet (ca 20 microwatts/cm?) for a period of 
two hours with both electrodes at ground. The total 
energy absorbed per unit area in this time is 1.4×10⁶
ergs/cm². This is sufficient to fill most traps existing in
the sample (ca 10'* per unit volume). It was then ultra- 
violet polarized under exactly the same conditions as 
the previous case. The polarizations observed in these
two measurements are given in Table VI. The same
type of experiment was repeated using fast electrons as 
the exciting and pre-exciting source. This was done to 
preclude any interference in the previous measurements 
because of the inhomogeneous absorption of ultraviolet 
light. It is apparent from the data that pre-excitation 
has little effect upon the value of polarization under 
ultraviolet light or fast electrons, showing that an 


FIG. 12. Lifted electrode voltage in anthracene excited with
anthracene fluorescent light as function of polarizing time.
(Polarizing voltage = 200 volts.)
Axes: lifted electrode voltage ($V_p$) versus polarizing time (minutes).





inhomogeneous distribution of electrons in traps is not 
necessary for strong persistent polarization. A few 
further measurements have been made for shorter 
polarizing periods and these agree with the afore- 
mentioned long-time results. 
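
The pre-excitation dose quoted above follows from the stated intensity and duration, as the Python sketch below shows. The conversion to a photon count assumes that the ultraviolet is the 3660 A line and that all incident energy is absorbed; both are simplifying assumptions made here, and the constants are rounded values.

```python
PLANCK_ERG_S = 6.626e-27     # Planck constant, erg*s (rounded)
LIGHT_SPEED_CM_S = 2.998e10  # speed of light, cm/s (rounded)


def dose_ergs_per_cm2(intensity_uw_per_cm2, seconds):
    """Energy per unit area delivered by a steady source (1 microwatt = 10 erg/s)."""
    return intensity_uw_per_cm2 * 10.0 * seconds


def photon_energy_erg(wavelength_angstrom):
    """Photon energy h*c/lambda, with the wavelength given in angstroms."""
    wavelength_cm = wavelength_angstrom * 1e-8
    return PLANCK_ERG_S * LIGHT_SPEED_CM_S / wavelength_cm


dose = dose_ergs_per_cm2(20.0, 2 * 3600)     # ca 20 microwatts/cm^2 for two hours
photons = dose / photon_energy_erg(3660.0)   # assumed monochromatic at 3660 A

print(f"{dose:.2e} ergs/cm^2")
print(f"{photons:.1e} photons/cm^2")
```

The dose comes out at about 1.4×10⁶ ergs/cm², in agreement with the figure in the text; dividing the photon count by the sample thickness would give the number of photons available per unit volume for comparison with the trap density.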

The previous discussions assumed that the polariza- 
tion was created by a separation of charge internally 
and their subsequent localization. There are various 
other possible ways in which polarization may be 
established. It is possible that the sample as a whole is 
no longer electrically neutral due to charges removed 
from or injected into the sample. It was found, however,
that if the polarity of the applied field was reversed,
the signs of the polarization voltages were also
reversed. Since the signs of the charge transferred into
or out of the sample should not be changed by reversing 
the field, the observed polarization cannot be accounted 
for by any charging of the sample. Measurements 
performed with the lifted electrode and the radiation 
discharge method showed no trace of an excess charge. 

In a neutral sample persistent polarization can be 
caused either by an injection of charge of one sign from 


TABLE VI. Effect of pre-excitation upon polarization in K.

(A) Effect of ultraviolet pre-excitation upon ultraviolet polarization in K.
Ultraviolet excitation. Polarizing voltage = 50 volts. Polarizing time = 5.0
minutes.

Pre-excitation time (min)    Lifted electrode voltage (arbitrary units)
-------------------------    ------------------------------------------
  0.0                        35.3
 20                          33.7
120                          36.1

(B) Effect of beta-ray pre-excitation upon beta-ray polarization in K.
Strontium 90 source. Polarizing voltage = 200 volts. Polarizing time = 10.0
minutes.

Pre-excitation (roentgens)   Lifted electrode voltage, $V_e$ (volts)
--------------------------   ---------------------------------------
0                            92
1×10⁴                        88
1.7×10⁵                      96
one of the electrodes and a simultaneous ejection of
charge of the same sign at the other electrode (see any 
discussion of electrets); by a separation of mobile 
charge and their subsequent localization; or by dipole 
alignment. The sign of the charge near the surface of 
the sample was always found to be opposite to the sign 
of the potential applied to the adjacent electrode during 
polarizing. This indicates that the polarization is due 
to a separation of mobile charges inside the sample, 
since an injection of charge would bring about the 
opposite sign of polarization, and dipole polarization 
seems unlikely because of the parallelism between these 
effects and those of photoconductivity. 

The neutrality of a polarized sample and the exclusion 
of injection and ejection of charges is further confirmed 
by the fact that a sample can be polarized as well with 
contact electrodes as with both electrodes insulated 
from the sample by means of mica plates. 

Experiments were performed to determine the distri- 
bution of polarization inside the sample. Charges may 
be separated into two clouds, so that in the part of the 
sample adjacent to the negative electrode a positive 
charge prevails and in the other part a negative charge, 
or the polarization may be uniform throughout the
sample. In the first case the sample could be divided 
by a plane parallel to the electrodes into two parts, 
each of which should have an excess charge of a sign 
opposite to that of the adjacent electrode during the 
polarizing process. 

To discriminate between the two possibilities, two K 
samples, both mounted on conducting glass, were 
pressed together with powder surfaces in contact 
and polarization measurements were performed. The 
arrangement, treated as a single sample, showed the 
normal polarization properties when tested with the 
radiation discharge method. After repolarizing, the two 
samples were separated. The magnitude and sign of 
the charge at the two interface surfaces were indi- 
vidually determined by the lifted electrode method. 
The sample that was in contact with the negative elec- 
trode during the polarizing process displayed a negative 
charge at the interface surface (surface in contact with 
the second sample). If this interface was covered with 
an aluminum foil electrode, the voltage released by the 
radiation discharge produced a positive voltage signal, 
in agreement with the results obtained by the lifted 
electrode method. The sample on the other electrode 
exhibited a positive surface charge on the interface 
surface. This indicates that both samples were similarly 
polarized in a local type of polarization throughout the 
whole sample. Such a uniform polarization distribution 
may be attributed to the grain structure of the samples. 
Inside an individual grain a true separation of space 
charge may occur. Hofstadter? and Smith® report a 

macroscopic separation in nearly ideal single crystals. 

It may be noted from such measurements that re- 
moving the aluminum foil while it is strongly attracted 
by the field has little effect upon the persistent polarization.


V. DARK POLARIZATION


As already mentioned, some of the inorganic sub- 
stances can be polarized in the dark (without con- 
comitant radiation) to values which are not too much 
smaller than those obtained under light polarizing. 
This dark polarization occurs, with one exception (an 
electroluminescent ZnS powder), only after the sub- 
stance was pre-excited; it increases with increasing 
pre-excitation and decreases with increasing time lapse 
between pre-excitation and polarizing, and vanishes 
completely only when the powder is de-excited by 
infrared radiation before polarizing. Whereas polar- 
ization can be discharged by any kind of ionizing 
radiation, the capability of being dark-polarized can 
be destroyed only by infrared irradiation. This is due 
to the fact that dark polarization comes about by the 
displacement of the small amount of free charge which 
persists in the powder as a consequence of electrons 
liberated from traps even long periods of time after 
excitation. Therefore only those radiations which empty 
the traps and do not excite the powder prevent dark 
polarization. 

The appearance of such a dark polarization is closely 
connected to the phosphorescence and/or the slow 
decay of photoconductivity in these substances after 
excitation, which also demonstrate the presence and 
the creation of mobile charges from trapped electrons. 
The decrease in the number of free charges with in- 
creasing period of time elapsed since the removal of 
the exciting radiation, accounts for the fact that dark 
polarization performed shortly after pre-excitation is 
larger than when performed a long time after pre- 
excitation. 

The following measurements on K illustrate all this
behavior quite clearly. Table VII(A) describes the considerable
dark polarization as a function of the duration
of pre-excitation. It is, however, after 20 minutes of
pre-excitation, still less than half the ultraviolet polarization.


TABLE VII. Dark polarization in K.

(A) Dark polarization as function of pre-excitation time in K. Ultraviolet
excitation. Polarizing time = 10.0 min, performed 30 sec after pre-excitation.
Polarizing voltage = 100 volts.

Pre-excitation time (min)      Lifted electrode voltage, Ve (volts)
0                              17
10                             18.3
20                             27.5
Polarized with ultraviolet     64.0

(B) Dark polarization as function of time after pre-excitation in K.
Ultraviolet excitation. Polarizing voltage = 100 volts. Polarizing time
= 10.0 min.

Sample No.   Pre-excitation time (min)   Delay before polarizing   Lifted electrode voltage, Ve (volts)
1            20                          30 seconds                …
1            20                          20 hours                  …
2            135                         30 seconds                …
2            135                         18 hours                  …

The value of the dark polarization was however not 
strictly proportional to the length of pre-excitation. 
This is to be expected since the free charge responsible 
for the dark polarization mostly originates in the re- 
emission from shallow traps. For a given intensity of 
pre-excitation, the population of these shallow traps 
does not increase with time of pre-excitation as soon as 
this time exceeds the trap lifetime. 

Table VII(B) indicates how dark polarization di- 
minishes if the polarizing is performed a long time after 
pre-excitation. This slow decrease of dark polarizability 
with time is closely correlated to the weak dependence 
of polarization by light on the light intensity. Even 
with a low concentration of mobile charges, polarization 
can be obtained if the polarizing field is applied for 
sufficient time, just as with extremely weak light in- 
tensities polarization can be produced. 

Dark polarizability and short-time spontaneous decay 
of polarization are correlated. Substances with initially 
strong dark decays, i.e., having relatively many shallow 
traps, show large values of dark polarization; those
with slight initial dark decays, such as anthracene, show
only small values. A sample of anthracene, pre-excited
with full intensity of visible light for 5 minutes, was 
dark-polarized one minute later, and gave a V, voltage 
of 12 volts, which is small compared to the polarization 
voltage of over 100 volts with simultaneous excitation. 
Dark polarizing 9 minutes after pre-excitation produced 
a V, of 5.6 volts. Quenching with infrared radiation for 
5 minutes eliminates the dark polarization. Dark polar- 
ization in anthracene is much weaker and the dark 
polarizability decreases more rapidly with time after 
pre-excitation than in K. This shows the much smaller 
number of shallow traps in anthracene than in K. 

At the other extreme are the properties of the electro- 
luminescent ZnS powder.’® This has an extremely fast 
dark decay of polarization in comparison to K. How- 
ever, the dark polarization is at least one-third of the 
ultraviolet polarization, even after strong de-excitation 
of the sample and dark polarizability cannot be elimi- 
nated by infrared de-excitation. 

Further, it was found that the time required to reach 
an equilibrium value is much longer for dark polariza- 
tion than for polarization under strong ultraviolet, 
which corresponds with the finding that polarizing under 
weak light irradiation also requires longer times to 
reach equilibrium. 

These dark-polarizable substances offer another 
method to further explore the mechanisms of polariza- 
tion. The influence of a subsequent reversal of the 
polarizing field on the polarization can be studied. 
A sample was dark polarized with a given field direc- 
tion; the field was removed and then reapplied in the 
reversed direction (this will be termed field reversal). 


45 Obtained from Sylvania Electric Company.


TABLE VIII. Effect of polarizing field reversal on 
dark polarization in K. 
After this, the remaining polarization was determined
as a function of the duration of polarizing periods of 
the original and the reversed fields and as a function of 
the respective field strengths. The results are tabulated 
in Table VIII. They show that the polarization is not 
necessarily reversed with field reversal and that the 
direction of the first polarizing field is the determining 
factor. Considerable reduction and eventual reversal 
of the initial dark polarization occurs if the reversed 
field is of the same strength and is applied for a period 
much longer than the polarizing time, or if the reversed 
field is slightly stronger than the original one. In this 
latter case a reversal of the polarization occurs when the 
time of application of the reversed field is only slightly 
longer than the polarizing time. With decreasing polar- 
izing time the time of reversed field application neces- 
sary to reverse the polarization decreases somewhat 
faster. In the case of the electroluminescent ZnS, the 
polarization was always found to be in the direction of 
the last polarizing field applied, contrary to the above 
results with K. This is a further indication of the large 
number of shallow traps in this substance and the 
possibility of a direct field excitation. 

These results could be interpreted in the following 
way: the observed dark polarization is not solely due 
to the displacement of those charges already present in 
the conduction band (which are in equilibrium with 
those found in traps) at the moment the field is applied. 
While the field is on, new electrons are emitted from 
traps at those places where the free electron density is 
depleted as a consequence of the field action, and these 
electrons too are displaced, increasing the polarization.
With longer time lapse from the moment of field appli- 
cation the number of electrons available decreases 
constantly, and this slows down the further increase in 
polarization. If, after a certain time $t$, the reversed
field is applied for the period $t'$, fewer electrons are
available for displacement during $t'$ than during the first
time interval $t$, provided $t'$ is not much longer than $t$.

Thus one can understand why the subsequent appli- 
cation of a reversed field can annul the original polar- 
ization only if the reversed field is applied for a much 
longer time than the initial field, and why the action 
of the reversed field becomes more pronounced when $t$
is made smaller. This picture is in agreement with the 
previously discussed findings, that the polarization ob- 
tained under concomitant illumination is not very 
dependent upon the intensity of the illumination if the 
polarizing time is extended enough. 
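
The argument above can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that the rate at which electrons become available for displacement falls off as r0/(1 + s/tau), where s is the time since the field was first applied. This functional form, the parameters r0 and tau, and the function names are assumptions and are not taken from the measurements. Under that assumption the reversed field must act for t' = t + t^2/tau to annul a polarization built up during t, so the required ratio t'/t grows with t.

```python
import math


def net_polarization(t_forward, t_reverse, tau, r0=1.0):
    """Net displaced charge (arbitrary units) after a forward field applied
    for t_forward, followed by a reversed field applied for t_reverse.

    Assumption (illustrative only): electrons become available at the rate
    r0 / (1 + s / tau), with s the time since the field was first applied.
    """
    forward = r0 * tau * math.log(1.0 + t_forward / tau)
    total = r0 * tau * math.log(1.0 + (t_forward + t_reverse) / tau)
    reverse = total - forward  # charge displaced while the field is reversed
    return forward - reverse


def annulling_time(t_forward, tau):
    """Reversed-field duration that brings the net polarization to zero
    under the same assumed supply rate."""
    return t_forward + t_forward ** 2 / tau


if __name__ == "__main__":
    for t in (1.0, 5.0, 20.0):
        t_rev = annulling_time(t, tau=5.0)
        print(f"t = {t:5.1f}   t' = {t_rev:6.1f}   t'/t = {t_rev / t:4.1f}")
```

In this toy picture a reversed field of equal strength and equal duration never fully cancels the first polarization, and the shortfall shrinks as t is made smaller, which is the qualitative behavior described for K.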


VI. DISCHARGE OF POLARIZATION 


Any radiation creating mobile charges will not only 
produce polarization when an external field is applied 
but will also discharge a persisting polarization in the 
absence of an external field. This discharge process will 
be theoretically discussed in Sec. VIII. 

The discharge of polarization is different when it is 
performed with both electrodes at ground from that 
performed while one electrode is isolated because of the 
build up of an external field during the latter procedure. 
In the first case the polarization can be removed com- 
pletely by a sufficiently long continuous irradiation. 

The discharge of polarization by low-intensity radia- 
tion is described in Table V for K under various types 
of radiation; the voltage build up was determined with
one electrode isolated and attached to the electrometer. 
K shows a noticeable spontaneous dark decay only up 
to a period of about 20 minutes after polarizing; there- 
fore all measurements reported here were made after 
the polarized substance had undergone a dark decay of 
at least one hour. The intensity of the ultraviolet light 
used for the discharge was 2.4X10-" watt/cm?. This 
produced a discharged voltage at the grid of 1.5xX10-* 
volt in one second. This corresponds to a discharge of
0.63 volt/second per erg/cm² (ca 3×10⁻¹¹ coulomb/sec
per erg/cm² across the 50-micromicrofarad sample)
of radiation absorbed. If it is assumed that a single 
photon produces one mobile electron which then moves 
completely across the sample under the internal field, 
then the quantum efficiency of the discharge process is 
about 10-*. However, if retrapping and recombination 
is taken into account, the quantum efficiency increases 
to between 10 and 1. With such low-intensity radia- 
tion the discharged voltage increases linearly with time 
over a very long period before it shows saturation 
characteristics. 
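
The unit conversion quoted above can be checked with a few lines of arithmetic. In the sketch below, the 0.63 volt per erg/cm² and the 50-micromicrofarad capacitance come from the text. The ultraviolet wavelength (365 nm) and the irradiated area (1 cm²) are illustrative assumptions, since neither is stated in this section, and the function names are hypothetical. The efficiency computed this way corresponds to the simple picture in which every mobile electron crosses the whole sample.

```python
ELECTRON_CHARGE = 1.602e-19  # coulomb
PLANCK = 6.626e-34           # joule second
LIGHT_SPEED = 2.998e8        # metre per second
ERG = 1.0e-7                 # joule


def charge_per_fluence(volts_per_fluence, capacitance_farad):
    """Charge (coulomb) released per erg/cm^2, from Q = C * V."""
    return volts_per_fluence * capacitance_farad


def quantum_efficiency(volts_per_fluence, capacitance_farad,
                       wavelength_m=365e-9, area_cm2=1.0):
    """Electrons transferred per absorbed photon.

    wavelength_m and area_cm2 are assumed values, not data from the paper.
    """
    electrons = charge_per_fluence(volts_per_fluence, capacitance_farad) / ELECTRON_CHARGE
    photon_energy = PLANCK * LIGHT_SPEED / wavelength_m  # joule
    photons = ERG * area_cm2 / photon_energy              # photons per erg/cm^2
    return electrons / photons


if __name__ == "__main__":
    print(charge_per_fluence(0.63, 50e-12))   # coulomb per erg/cm^2
    print(quantum_efficiency(0.63, 50e-12))   # electrons per photon
```

With these assumed values the charge comes out close to 3×10⁻¹¹ coulomb, as quoted in the text, and the efficiency is far below one electron per photon.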

In the case of low-energy radiations other than ultra- 
violet, the intensities required to produce a certain 
discharged voltage were much larger. The values given 
in the table are only approximate values for these
cases. The absorption of visible light in K is much 
smaller than that of ultraviolet and this reduces the 
discharge rate correspondingly to about one-tenth of 
the value for ultraviolet. For infrared (with an intensity
of about 15 milliwatts/cm²), which is primarily absorbed
by the trapped electrons or the remaining ionized
activators only, the absorption coefficient is still smaller
and depends moreover on the number of trapped
electrons. Infrared radiation of wavelengths of 1.5
microns and longer has been found to be still effective
in discharging the polarization in K. Similar wave- 
lengths also show detectable effects on photocurrents, 
primarily in the form of quenching. But the effects of 
such long wavelengths are more pronounced in the dis- 
charge of polarization than in the conductivity in- 
vestigation. 

The discharge for ultraviolet, visible, and infrared
radiations was determined by measuring $V$; for high-energy
radiations it was determined by measuring $V_p$.
One such curve of the gamma-induced decay in K is
shown in Fig. 13 for 5 r total irradiation. If for a given
polarization $V_p$ is measured and, thereafter, $V$, the
results differ. The reason for this difference, as already
pointed out, resides in the build up of the backfield,
which prevents complete discharge of polarization when
the electrodes are isolated. The ratio of these two
values can be obtained by comparing Eqs. (8) and (13)
of Sec. VIII. This gives for the ratio $V/V_p = 1 - \varepsilon$.
For K, the best evidence gives a value for $\varepsilon$ (determined
by the $V/V_0$ ratio) of 0.30. Thus $V/V_p = 0.7$. The agreement
with the value of 0.6 given in Table V is excellent,
though perhaps to some extent fortuitous.

FIG. 13. Dark decay of polarization in K showing increased
decay rate under gamma-ray irradiation (polarizing voltage = 200
volts; polarizing time = 3 minutes; ultraviolet intensity = 15
microwatts/cm²).

During irradiation, $V$ displays an increase with time
given by $V = V_{\max}[1 - \exp(-t/\tau)]$ and approaches a
saturation value. When this is reached, further irradia- 
tion produces no further increment in voltage. If, after 
the removal of the radiation, the discharged voltage is 
grounded off and the radiation is reapplied, a new 
voltage build up begins again. This too approaches a 
saturation value, which is, however, smaller than the 
previous value and is reached after a longer period of 
irradiation than the first. In K this discharging process 
can be repeated 4 or 5 times before V drops below 
5 percent of the initial V. In anthracene the process can 
be repeated under certain conditions much more often, 
and still produces considerable discharge voltages after 
more than 10 repetitions. The values of these successive 
discharge voltages are plotted in Fig. 14 for each 
repetition for both substances. The voltage V reached 
in one run has been found to be independent of the kind 
of discharging radiation and of its intensity as should be 
expected. Short flashes, if strong enough, produce the
same V as weak, long-time irradiation. The ultraviolet
light source (ca 15 microwatts per square centimeter) 
and the visible light source were both used to discharge 
the polarization. In both cases the maximum voltage V 
reached was identical to that produced with the flash 
unit in 1/2000 of a second. The time of irradiation 
required to reach the maximum voltage was of the 
order of seconds for visible light and a minute for the 
ultraviolet, because of the several hundred times larger 
intensity of the visible light source. 
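
The saturating build-up of the discharge voltage described in this paragraph can be sketched numerically. The example below takes the exponential law V = V_max[1 - exp(-t/tau)] at face value; the numerical values of v_max and tau and the function names are hypothetical and serve only to show how the time needed to approach saturation scales with tau.

```python
import math


def discharge_voltage(t, v_max, tau):
    """Discharged voltage after irradiating for a time t (same units as tau)."""
    return v_max * (1.0 - math.exp(-t / tau))


def time_to_fraction(fraction, tau):
    """Irradiation time needed to reach a given fraction of v_max."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must satisfy 0 <= fraction < 1")
    return -tau * math.log(1.0 - fraction)


if __name__ == "__main__":
    tau = 2.0     # hypothetical time constant, seconds
    v_max = 10.0  # hypothetical saturation voltage, volts
    for f in (0.5, 0.9, 0.99):
        t = time_to_fraction(f, tau)
        print(f"{f:4.2f} of V_max after {t:5.2f} s -> {discharge_voltage(t, v_max, tau):5.2f} V")
```

Because the saturation value does not depend on tau in this law, a stronger source shortens the build-up without changing the final voltage, in line with the observation that flash, visible, and ultraviolet sources give the same maximum V.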

The dependence of the saturation values for consecutive
discharges upon the applied polarizing voltages
is shown in Figs. 6 and 7 for anthracene and K respectively.
These figures will be discussed in Sec. VIII.
They show that $V$ is proportional to the polarization
and thus to $V_p$. It may be noted that the intensity of
the discharging radiation is much more than sufficient,
from an ordinary quantum efficiency standpoint, to
release all the polarized charges.

FIG. 14. Maximum radiation discharge voltages as function of
the number of discharge repetitions (ordinate: radiation discharge
voltage; abscissa: number of repeated discharges). (A) experimental
values in anthracene; (B) theoretical values for anthracene;
(C) experimental values in K. (Polarizing voltage = 400 volts;
polarizing time = 1 minute; ultraviolet intensity = 15 microwatts/cm².)

FIG. 15. Typical polarization dark decay curves in: (a) anthracene;
(b) K; (c) electroluminescent ZnS powder.

That ultraviolet radiation is a very effective dis- 
charging agent is due to its strong absorption and the 
fact that practically each absorbed quantum creates a 
free electron, which moves in the interior polarization 
field in such a way as to compensate the existing polar- 
ization. Thus the discharge of polarization in this case 
is not so much due to a removal of the trapped charges 
as to a canceling out of the prevailing inhomogeneous 
charge distribution. Only infrared radiation affects the
trapped charges themselves. One would expect a smaller 
efficiency for the same amount of absorbed energy of 
high energy radiation than for ultraviolet radiation, 
since the energy consumption for creating one free 
electron by high-energy radiations is larger than for 
ultraviolet radiations. The measurements in Table V 
show, however, that both efficiencies are of the same 
order of magnitude. This is probably due to the in- 
homogeneous absorption of the ultraviolet. It was 
found that ultraviolet light discharges the polarization
mostly in the region where it is mainly absorbed and 
only slightly in the rest of the sample. High-energy 
radiation excites, and therefore discharges, the sample 
uniformly. 

If the electrodes are kept at ground during the first 
discharging irradiation, there will be no field buildup to 
restrict the polarization discharge. The amount of 
polarization discharged by this first irradiation will then 
be larger than with isolated electrodes, and a second 
irradiation with the electrode now isolated, will furnish 
a V smaller by about a factor of 2 than if the first 
irradiation was given with the electrode isolated, be- 
cause of the smaller residue of polarization. 


FIG. 16. Dark decay of polarization in anthracene for two
polarizing voltages, 100 and 150 volts, with: (a) ultraviolet and
(b) visible light excitation at each voltage (polarizing time = 5
minutes).


VII. ELECTRICAL STORAGE AND ITS DECAY


The spontaneous decay of the polarization consists
of two relatively distinct portions: a fast decay during
the first 20 minutes, followed by a very slow decay over
a very long period of time. This is illustrated in Fig. 15,
which describes $V_p$ as a function of time for anthracene,
K, and an electroluminescent ZnS. Such curves are
characteristic for a given substance but their exact 
shape depends upon the polarizing voltage, the thickness 
of the sample and the polarizing radiation used as well 
as the ambient temperature (not yet investigated in 
detail). If the curves of Figs. 15 and 16 are extrapolated 
to longer times it is seen that the further decay of 
polarization is very slow. This was experimentally 
verified by measuring the decay of polarization in K 
and anthracene. In K, after 100 hours, the polarization 
was still 60 percent of the short time value. In anthra- 
cene large values were measured after a month of dark 
decay. V, V,, and the corresponding Q, for three par- 
ticular cases are given in Table IX, for different decay 
times. These values are comparable to those obtained 
in electret work. Figure 16 presents the curves for 
ultraviolet and visible light polarization which notice- 
ably differ from each other although they display the 
same initial polarization. The latter polarization dis- 
plays a lower value of the long persistent part and the 
initial decay is faster for both substances. The persistent 
part of the polarization increases percentage-wise with 
increasing voltage and increasing thickness for both 
powders. Experiments are now in progress to determine 
more accurately the actual connection between the 
shape of the decay curve and the occupation of the 
traps. Recent experiments (to be reported in a subse- 
quent paper) indicate methods whereby the rate of 
decay can be substantially reduced. 

The short-time dark decay was measured most accu- 
rately from 30 seconds to 20 minutes by connecting 
the sample to the electrometer immediately after being 
polarized. The electrodes were at ground except for the 
brief period of the measuring times. dP/dt, the rate of 
decay of polarization, for a thin sample of K, was 
found to decrease with time approximately as c/t, where 
c is a constant. After 20 minutes the rate decreased to 
a value below the present apparatus sensitivity. 
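
A decay rate that falls off as c/t implies a logarithmic loss of polarization over the measured interval. The sketch below integrates dP/dt = -c/t from a starting time t0; the starting value p0, the constant c, and the function names are hypothetical, and the law is assumed to hold exactly between 30 seconds and 20 minutes, which the text states only approximately.

```python
import math


def decay_rate(t, c):
    """Rate of change of polarization, dP/dt = -c / t."""
    return -c / t


def polarization(t, p0, c, t0=0.5):
    """Polarization at time t (minutes) given P(t0) = p0 and dP/dt = -c/t.

    t0 defaults to 0.5 min, the earliest time at which the text reports
    accurate short-time measurements.
    """
    if t < t0:
        raise ValueError("the logarithmic law is only applied for t >= t0")
    return p0 - c * math.log(t / t0)


if __name__ == "__main__":
    p0, c = 100.0, 3.0  # hypothetical values in arbitrary units
    for t in (0.5, 2.0, 5.0, 20.0):
        print(f"t = {t:5.1f} min   P = {polarization(t, p0, c):6.2f}   dP/dt = {decay_rate(t, c):6.3f}")
```

Each tenfold increase in time removes the same amount of polarization in this law, which is why the decay appears to stop once the rate drops below the sensitivity of the apparatus.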

If, a sufficient time after the sample shows no further 
detectable short time decay, a strong shot of light is 
given and the resulting voltage is grounded off, then an 


appreciable short-time dark decay occurs again. This 
too disappears after several minutes. Such a behavior 
strongly resembles the behavior of storage phosphors 
when they are light stimulated. The discharging radia- 
tion produces mobile charge which is partially retrapped 
in shallow levels and produces the new short time decay. 

It has been found that in conditions of even moderate 
humidity, the polarization decay in powders is ab- 
normally rapid. Therefore, all measurements were per- 
formed in a desiccated chamber. 


VIII. PHENOMENOLOGICAL MECHANISM OF 
POLARIZATION 


A detailed atomistic description of the polarization mechanism, 
taking into account the picture of Kallmann and Kramer” on the 
various charge transitions possible in a photoconductive sub- 
stance, is in the process of preparation and will be reported in a 
later paper. Here a simple model for the case of equilibrium 
polarization will be described. 

Although the polarization is distributed throughout the sample
in a uniform way, it will be assumed here as a simplification that
the observed polarization can be represented by two charge layers
in the interior of the sample with surface charge densities $\sigma_2^-$ and
$\sigma_2^+$ located at distances $d_1$ and $d_3$ from the positive and negative
electrodes respectively (see Fig. 17), and that $d_1 = d_3$ and
$|\sigma_2^-| = |\sigma_2^+|$. Then $d_2$, which is the separation distance of the two
charge layers, is a parameter which describes the polarization and
can be evaluated from the experiments.

If an external field is applied while mobile charges are being 
created, the field picture as shown in Fig. 17 can be described as: 


$$V_0 = 2E_1 d_1 + E_2 d_2. \tag{1}$$


The unknown charge density, $\sigma_2^-$, can be linked with the distance
$d_1$ by two possible assumptions. One is that $\sigma_2^-$ will show no
further increase in time when the field $E_2$ becomes zero. The
second is that the rise of $\sigma_2^-$ is limited when the force on the
charge layer approaches zero. Both conditions are rather artificial
and it would be difficult to say which may give a better presentation.
Calculations made show that the final results are identical in
terms of a constant $\varepsilon$ (defined below); only the meaning of $\varepsilon$ is
different for both assumptions. The choice in the following development
will be the first assumption since this leads to $d_2 = d_0$
when $\varepsilon = 1$, rather than $d_2 = d_0/2$ when $\varepsilon = 1$ as results from the
second assumption. This assumption that $E_2 = 0$ for equilibrium
implies that the body photocurrents in the steady state do not
essentially influence the polarization.
From elementary electrostatics, we have: 


$$E_2 = (V_0/d_0) - (8\pi/\kappa)(d_1/d_0)\,|\sigma_2^-|, \tag{2}$$


TABLE IX. Long time persistence of P.I.P. 


and, if we use the first assumption that $E_2 = 0$ at equilibrium,

$$|\sigma_2^-|_{\mathrm{eq}} = V_0\kappa/(4\pi d_0[1-\varepsilon]); \quad \varepsilon = d_2/d_0. \tag{3}$$


The equilibrium charge density, $|\sigma_2^-|_{\mathrm{eq}}$, in Eq. (3) will of course
be reached only if the total number of mobile charges created
during the polarizing time is sufficient. If the exciting source and
then the applied voltage are removed and the electrodes grounded,
an image charge appears at the electrodes and the total voltage
drop across the sample is zero. The new field $E'$ is not zero, but
there is only very little charge motion since most of the charges
are frozen in traps. The new field distribution is then:

$$2E_1' d_1 + E_2' d_2 = 0. \tag{4}$$

The image charge density on the electrodes is given by:

$$|\sigma_0| = V_0\kappa\varepsilon/[4\pi d_0(1-\varepsilon)]. \tag{5}$$

The direction of the new field, $E_2'$, is the reverse of the original
direction when the applied field was on, and thus would tend to
decrease the polarization if mobile charges are present. This is the
meaning of the negative signs in the expressions for $E'$:

$$E_2' = -(4\pi/\kappa)\,|\sigma_2^-|_{\mathrm{eq}}(1-\varepsilon). \tag{6}$$


Under the influence of this field the small number of mobile
charges always present move constantly in a direction to decrease
$\sigma_2$, and thus produce spontaneous decay of polarization. This
shift of charges is the same as that observed in the dark-polarizing 
of the substance. 
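
The two-layer model can be checked numerically for self-consistency. The sketch below works in Gaussian units, as the factors of 4π/κ in the text suggest, and uses the symmetric geometry d0 = 2 d1 + d2 with ε = d2/d0. The relation E1' = 4π σ0/κ for the field next to a grounded electrode is an assumption based on Gauss's law and is not written out in the text; the function names and the numerical inputs are hypothetical.

```python
import math


def layer_offset(d0, eps):
    """Distance d1 of each charge layer from its electrode (d0 = 2*d1 + d2)."""
    return 0.5 * d0 * (1.0 - eps)


def inner_field(v0, d0, eps, sigma, kappa):
    """Field E2 between the layers while the voltage v0 is applied, Eq. (2)."""
    d1 = layer_offset(d0, eps)
    return v0 / d0 - (8.0 * math.pi / kappa) * (d1 / d0) * sigma


def sigma_equilibrium(v0, d0, eps, kappa):
    """Layer charge density at which E2 vanishes, Eq. (3)."""
    return v0 * kappa / (4.0 * math.pi * d0 * (1.0 - eps))


def grounded_fields(v0, d0, eps, kappa):
    """Fields (E1', E2') after the electrodes are grounded."""
    sigma = sigma_equilibrium(v0, d0, eps, kappa)
    sigma_image = eps * sigma                            # Eq. (5)
    e1 = 4.0 * math.pi * sigma_image / kappa             # assumed: Gauss's law at the electrode
    e2 = -(4.0 * math.pi / kappa) * sigma * (1.0 - eps)  # Eq. (6)
    return e1, e2


if __name__ == "__main__":
    v0, d0, eps, kappa = 1.0, 0.01, 0.3, 5.0  # hypothetical inputs, Gaussian units
    sigma = sigma_equilibrium(v0, d0, eps, kappa)
    e1, e2 = grounded_fields(v0, d0, eps, kappa)
    d1, d2 = layer_offset(d0, eps), eps * d0
    print("E2 at equilibrium:", inner_field(v0, d0, eps, sigma, kappa))
    print("voltage drop when grounded:", 2.0 * e1 * d1 + e2 * d2)
```

Both printed quantities should vanish to rounding error: the first confirms that Eq. (3) follows from setting E2 = 0 in Eq. (2), and the second that Eqs. (5) and (6) satisfy the grounded condition of Eq. (4).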

If the sample is irradiated, more mobile charges are created, and 
they also move in the direction of this field and decrease the 
polarization. In doing so, a portion of the image charge is freed, 
and creates an external voltage (V) across the electrodes when 
these are isolated during the irradiation. The new field distribution 
during depolarization, described by barred quantities, is: 


$$2\bar{E}_1 d_1 + \bar{E}_2 d_2 = \bar{V}. \tag{7}$$


This depolarization continues until again $\bar{E}_2 = 0$, at which time
no further decrease of polarization can occur. Then $V_{\max}$, the
radiation discharge voltage described in previous sections, is
given by:

$$V_{\max} = \varepsilon V_0, \tag{8}$$

since $\bar{E}_1$ ($= E_1'$) does not change during irradiation when one
electrode is isolated. The calculations assume that the sample is
isolated from ground during depolarization and that there is no
other capacity than that of the sample itself (the grid input
capacity of the electrometer is here considered negligible, though
in practice this is not always so). The total released charge is
given by:

$$q = V_0\kappa\varepsilon A/4\pi d_0, \tag{9}$$

if one uses the unpolarized sample capacitance. $V_{\max}$ is the
maximum radiation discharge voltage that can be produced by a
single irradiation of sufficient intensity. A second irradiation after
grounding $V_{\max}$ off (in the dark) would produce a new potential
$V_2 = \varepsilon^2 V_0$, and for the $n$th irradiation one obtains:

$$V_n = \varepsilon^n V_0; \quad q_n = V_0\kappa\varepsilon^n A/4\pi d_0. \tag{10}$$
For the sum of all the voltage maxima, one has:

$$\sum_n V_n = V_0\varepsilon/(1-\varepsilon). \tag{11}$$

This is larger than $V_0$ when $\varepsilon > 1/2$ (with $0 < \varepsilon < 1$). The factor $\varepsilon$ can be looked
upon as a parameter which has to be determined from the experiments.
It is assumed that $\varepsilon$ is independent of the applied voltage
to a first approximation. 
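
The successive discharge maxima form a geometric series, which is easy to tabulate. The sketch below assumes V_n = ε^n V_0 with n counted from 1, so that the first maximum is ε V_0 as in Eq. (8); the closed form for the sum is then V_0 ε/(1 - ε). The function names are hypothetical, and ε = 0.30 is the value quoted for K elsewhere in the text.

```python
def discharge_maxima(v0, eps, count):
    """Successive discharge maxima V_n = eps**n * v0 for n = 1..count."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    return [v0 * eps ** n for n in range(1, count + 1)]


def sum_of_maxima(v0, eps):
    """Closed form of the geometric series summed from n = 1."""
    return v0 * eps / (1.0 - eps)


def repetitions_until(fraction, eps):
    """Smallest n for which V_n / V_1 falls below the given fraction."""
    n = 1
    while eps ** (n - 1) >= fraction:
        n += 1
    return n


if __name__ == "__main__":
    v0, eps = 100.0, 0.30
    print(discharge_maxima(v0, eps, 5))
    print(sum_of_maxima(v0, eps))
    print(repetitions_until(0.05, eps))
```

With ε = 0.30 the discharge voltage in this idealized series falls below 5 percent of its first value at the fourth repetition, which is of the same order as the 4 or 5 repetitions reported for K.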
The lifted electrode voltage $V_e$ appearing at the electrometer
when the electrode is lifted follows from the same model and is:

$$V_e = \sigma_0 A/C = V_0\kappa\varepsilon A/[4\pi d_0 C(1-\varepsilon)], \tag{12}$$

where $C$ is the electrometer capacity and $A$ is the area of the electrode.
Instead of $V_e$ it is advantageous to use $V_p$ [defined in
Eq. (a), Sec. II] since it is independent of the electrometer
capacity.

$$V_p = V_e C/(\kappa A/4\pi d_0) = V_0\varepsilon/(1-\varepsilon), \tag{13}$$
where $(\kappa A/4\pi d_0)$ is the capacity of the unpolarized sample. It is
apparent from this that for values of $\varepsilon$ larger than 0.5, the voltage
$V_p$ will be larger than $V_0$, the polarizing voltage. This is the case
illustrated in Fig. 9, where even for low polarization times (with
high intensities) this occurs.

FIG. 17. Diagram of double charge layers in
phenomenological model.

If Eqs. (8) and (13) are used to eliminate $\varepsilon$, there results a
relation between purely experimental parameters.


$$V = V_p/(1 + V_p/V_0). \tag{14}$$


These relations may be compared with the experimental data, 
but it should be borne in mind that all these only hold for equi- 
librium polarizations, whereas, for many of the data reported in 
this paper, such an equilibrium was only very roughly approached. 
The data presented in Figs. 6 and 7 were taken with a polarizing 
time of only 1 minute and represent polarizations well below 
equilibrium as can be seen from experiments with longer polarizing 
times and with higher polarizing intensities. However, the voltage 
reached in the second discharge can be considered as the equi- 
librium polarization for an applied voltage equal to the first 
radiation discharge voltage, and similarly for the subsequent 
discharges. 
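
The relations of the model connect three measurable voltages through the single parameter ε, and the elimination that leads to Eq. (14) can be written out as a short sketch. It assumes V = ε V_0 for the first discharge and V_p = V_0 ε/(1 - ε), which is the reading of Eqs. (8) and (13) that reproduces Eq. (14); the function names are hypothetical. The sample values ε = 0.56 and V = 11.2 volts are those quoted for the thin fused anthracene sample, and V_0 = 20 volts is inferred from them rather than stated.

```python
def vp_from_epsilon(v0, eps):
    """Polarization voltage V_p = v0 * eps / (1 - eps)."""
    return v0 * eps / (1.0 - eps)


def epsilon_from_voltages(v_p, v0):
    """Invert the relation above for eps."""
    return v_p / (v0 + v_p)


def discharge_from_vp(v_p, v0):
    """Radiation discharge voltage from Eq. (14), with eps eliminated."""
    return v_p / (1.0 + v_p / v0)


if __name__ == "__main__":
    v0, eps = 20.0, 0.56  # v0 is inferred from 11.2 / 0.56
    v_p = vp_from_epsilon(v0, eps)
    print("V_p      =", v_p)
    print("epsilon  =", epsilon_from_voltages(v_p, v0))
    print("V (calc) =", discharge_from_vp(v_p, v0))
    print("V / V_p  =", discharge_from_vp(v_p, v0) / v_p)  # equals 1 - eps
```

The last printed ratio equals 1 - ε, the same relation used in Sec. VI to compare the two measuring methods for K.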

Thus, if one uses the $V$ values of curve 1 as the $V_0$ values for
curve 2, the calculated $\varepsilon$ varies from 0.70 at $V_0 = 62$ to 0.82 at
$V_0 = 38$. This indicates that $\varepsilon$ varies with $V_0$ as $\varepsilon = 1 - aV_0$ for
these measurements. There is, however, other evidence from
measurements of the ratio of $V$ to $V_0$ for equilibrium polarization
in anthracene which indicates that $\varepsilon$ is fairly constant over a wide
range of $V_0$ (see for example Figs. 8 and 9). This discrepancy is
being investigated at present. Another method of calculating $\varepsilon$ is
possible from repeated discharge maxima $V$ for a single original $V_0$.
This is given by the vertical column of points in Fig. 6. Using the
value of $\varepsilon$ taken for the curves 1 and 2 of the 400-volt column, the
succeeding $V$ values are calculated using Eq. (10) and the relation
of $\varepsilon$ and $V_0$ given above. These values are plotted in Fig. 14 as
the theoretical curve, B, and the actual values are given as curve A.

In Fig. 3 the voltages Vp and V are compared for various 
polarizing times in a thin fused anthracene sample. The curves 
indicate that polarization under the stated polarizing conditions 
approaches the equilibrium state after 200 seconds polarizing 
time. Both methods of measurements indicate this approach to 
equilibrium. This equilibrium time is shorter than for the other 
samples; this is probably due to the fact that this sample, thinner 
than usual, is more homogeneously polarized. 

From the equilibrium section of the curve for $V_p$, a value of
$\varepsilon$ of 0.56 is computed from Eq. (13). Putting this value into Eq. (8)
yields a $V$ of 11.2 volts. The measured value is 10.9 volts, an
agreement within 3 percent of the calculated value. It may be
noted that the ratio of $V_p$ to $V$ is only slightly dependent on the
polarizing period, whereas the individual values sharply increase
with increasing polarizing period. In Fig. 18 the ratio of $V$,
calculated according to Eq. (14), to the measured value of $V$ is
plotted against polarizing time. It is seen that for short polarizing
periods the calculated value is higher than the measured. For
longer times, approaching the equilibrium period, the ratio
approaches 1. This means that the values of $\varepsilon$ calculated from Eqs. (8)
and (13) agree with each other only for equilibrium polarization.
A further confirmation of the equations of the model has been
given in Table V for fast electron polarization in K where the
final column gives the ratio of the calculated to the measured
values of $V$.

FIG. 18. Ratio of calculated maximum radiation discharge voltage
to measured value (taken from the data of Fig. 11).


IX. SUMMARY REMARKS 


Persistent internal polarization comes about by the 
displacement of free charges inside the material and 
their subsequent localization in deep traps. A re- 
quirement for persistence is that the free charges 
disappear fast after the polarizing field is removed, 
otherwise these charges would tend to discharge the 
produced polarization spontaneously. Thus photocon- 
ductive materials are polarizable while they are irradi- 
ated with any kind of radiation or shortly thereafter as 
long as free charges may exist. After removing free 
charges by infrared quenching the polarizability by 
field alone no longer exists. Electroluminescent ma- 
terials are also polarizable because of the creation of 
free charges by the application of electrical fields. The 
spontaneous decay of polarization comes about by the 


motion of the remaining free charges in the internal 
polarization field. This shifts the charges in such a 
direction as to annul the polarization. If practically no 
free charges persist, polarization still decays spon- 
taneously when charges bound in shallow traps are 
present which evaporate into the conductivity band. 
Thus a condition for strong, long persistent polarization 
is a large number of deep traps and only a small number 
of shallow traps. In agreement with these considera- 
tions, the materials K and anthracene were the best 
polarizable substances. They also show the fastest decay 
of photocurrent and light emission after the excitation 
is removed. 

In this connection it may be realized that the total 
amount of charges trapped is larger than the amount 
necessary for maximum polarization. In our polariza- 
tion measurements, we find the polarization charge to 
be between 10- and 10-*® coulombs per cm?, that is, 
10° to 10" electrons/cm?. Even if one assumes that 
each electron has a displacement of only 1/100 of the 
maximum displacement (viz., the sample thickness) the 
total amount of electrons necessary to produce the 
polarization would range from 10" to 10" electrons in 
our layers. Since these layers are of the order of 7g mm 
this number is still smaller than the number of traps
usually observed in the inorganic phosphors which is of 
the order of 10'* per cubic cm. This must be also borne 
in mind when the decay of polarization is considered. 
The total amount of free charges required to cancel the 
polarization is larger than the polarization charge 
itself. 

Removal of polarization occurs by the creation of free 
charges by any kind of radiation. They move in the 
internal field to cancel the polarization. This removal 
of polarization charge can be accelerated by the appli- 
cation of a reversed field. 
A brief summary of some of the important experimental and
theoretical work related to the subject of metallic sputtering is 
presented. The need for measurements in high vacuum is indicated 
and an ion beam which utilized a Philips Ion Gauge discharge 
ion source to make high-vacuum sputtering ratio measurements 
is described. Absolute sputtering ratio data for the gas-metal 
combinations Ag-Kr, Ag-A, Ag-Ne, Ag-He, Cu-Kr, Cu-A, Pb-A, 
and Pb-He are presented in terms of the number of atoms sput- 
tered per incident ion, ra, versus incident ion energy, Eo, for ion 
energies varying between 400-6100 ev. The data are interpreted 
by treating the incident ion as a hard sphere which “cools” in 
a manner similar to a neutron losing energy by collisions in a 
lattice, each collision producing recoil atoms and atomic displace- 
ments near the surface. The number of atoms escaping, or 
“sputtering,” from the metallic surface is reduced from the 
number displaced by absorption within the metal which is ac- 
counted for by a parameter a. By use of elementary neutron 
cooling theory and the Seitz formula for displacements produced 
by a recoil atom within a solid, the formula for the number of 


atoms sputtered per incident ion is given by 


eEo\ 1 cE 
for the case of ions more massive than the metallic atoms. The 
effect of ions rebounding from the surface after the first collision 
is considered to produce effectively two types of incident current 
particles: (1) incident gas ions and (2) recoil metal atoms. These 
considerations lead to a modified sputtering ratio formula, which 
reduces to the above equation when $M_1 \geq M_2$. The displacement 
energy for the process, $E_d$, is calculated by use of G. K. Wehner’s 
data on sputtering thresholds and the relation $E_t = E_d/\epsilon$. A fair 
fit to the experimental data is obtained by suitable choice of $\alpha$ 
in the modified formula for the cases studied. 

The use of “hard” collisions is justified and an equivalent ion 
energy shift defined by equal average energy transfer on the first 
ion-atom collision is applied to the data. The subject of sputtering 
thresholds is treated by an attempt to bracket observed thresholds, 
$E_t$, between limits defined by the atomic heat of vaporization, the 
displacement energy, and the average energy transfer per collision. 





INTRODUCTION 


THE breakdown of a metallic surface due to positive 
ion bombardment is a phenomenon which 
has been observed since the earliest investigations in 
the field of gaseous electronics. Because of its basic 
nature and relative ease of attainment, this effect, 
termed “sputtering,” has subsequently been the basis 
of many experiments which have been reported in the 
literature. In most of these experiments, the glow dis- 
charge served as a source of ions and the rate of 
sputtering was measured by indirect means because of 
experimental difficulties. In order to make a direct 
measurement of the number of atoms sputtered per 
incident ion it is necessary to: (1) have a sufficiently 
large mean free path near the metallic surface to allow 
escape of all sputtered atoms, (2) return secondary 
electrons to the target, (3) have a source of ions with 
sharply defined energy, and (4) directly measure weight 
loss of the material sputtered. The glow discharge did 
not serve as an ion source which could satisfy any of 
the above requirements until the gas pressure was 
reduced to a lower limit of about ten microns. 

* Abstracted from a thesis presented to the University of 
Southern California in partial fulfillment of the requirements for 
the Doctor of Philosophy Degree. 

† Part of this work was sponsored by the U. S. Office of Naval 
Research. 

‡ Presently Member of Technical Staff, Bell Telephone Laboratories, 
463 West Street, New York, New York. 

¹ F. M. Penning and J. H. A. Moubis, Proc. Acad. Sci. Amsterdam 43, 41 (1940). 

² A. Güntherschulze and K. Meyer, Z. Physik 62, 607 (1930). 

³ A. Güntherschulze, Z. Physik 110, 149 (1941). 

The experiments of Penning and Moubis,¹ Güntherschulze 
and Meyer,² and Güntherschulze³ are representative 
of reliable studies which, by careful design, 
utilized the glow discharge as an ion source. Penning 
and Moubis were able to obtain measurements by use 
of a discharge employing an axial magnetic field parallel 
to a cylindrical cathode. The field aided in returning 
secondary electrons to the cathode and in reducing the 
operating pressure of the glow discharge. The amount 
of material sputtered was measured by the gain in 
weight of mica disks placed in a strategic position 
opposite to the cathode. Güntherschulze and Meyer² 
were able to make reliable measurements by running a 
glow discharge between a heated filament and a cathode 
with a hole in it to permit some of the ions to pass 
through. These ions struck a target which was sus- 
pended directly above the cathode by a spring at a 
negative potential of 200 to 1200 volts. As the target 
lost weight, it would rise slightly, the amount of rise 
being a measure of the weight loss. The measured 
sputtering rates were independent of pressure at pres- 
sures less than 0.010 mm Hg. 

Güntherschulze³ eliminated the effect of back diffusion 
by use of a cylindrical cathode surrounding a wire 
anode. Assuming the pressure of sputtered material to 
be constant throughout the vessel, the rate of deposition 
on the anode was identical with the rate of removal 
from the cathode. Measurement of the weight gained by 
the wire anode gave information on the sputtering rate 
which agreed closely with that of Penning and Moubis. 

⁴ General Electric Company, Ltd., Phil. Mag. 45, 98 (1923). 

The General Electric Company, Ltd., London, England,⁴ 
measured sputtering rates by use of a pure 
tungsten wire mounted in a tube with additional electrodes 
for operating a glow discharge. Application of a 
negative potential to the wire while the discharge was 
operating caused the resistance of the wire to increase 
due to surface sputtering and consequent reduced 
cross-sectional area. Thus, the change in resistance of 
the wire afforded a measure of the amount of material 
sputtered from the sample. This method was suited for 
measuring sputtering thresholds and values were found 
ranging from 25-80 volts for argon to 700 volts for 
hydrogen. 

Wehner⁵ has recently performed notable experiments 
to determine the sputtering threshold of a large number 
of metals bombarded by mercury ions accelerated from 
a plasma at low pressure surrounding the metal target 
under study. He has developed an empirical formula 
for the sputtering threshold which involves the velocity 
of sound in the metal, the average energy transfer per 
collision between ion and metal atom and the heat of 
sublimation. In addition, by collecting the sputtered 
deposits from a metal strip in a plasma, Wehner⁶ has 
been able to show that Hg⁺ ions incident near the edge 
of the strip, i.e., at an oblique angle, will require less 
energy to begin sputtering than those which are 
normally incident near the center of the strip. 

Bareiss⁷ measured the resistance of gold-foil anodes 
bombarded by electrons of energy 200-500 ev. If there 
had been any removal of gold by sputtering at these 
anodes, the subsequent resistance change could have 
been easily detected; however, in no case was he able 
to find any indication of sputtering due to electrons at 
these energies. 

⁵ G. K. Wehner, Phys. Rev. 93, 633 (1954). 

⁶ G. K. Wehner, J. Appl. Phys. 25, 270 (1954). 

⁷ Max Bareiss, Z. Physik 68, 585 (1931). 

⁸ U. K. Bose, Indian J. Phys. 12, 95 (1938). 

⁹ V. Bush and C. G. Smith, Trans. Am. Inst. Elec. Engrs. 41, 402 (1922). 

¹⁰ C. H. Townes, Phys. Rev. 65, 319 (1944). 

¹¹ A. von Hippel, Ann. Physik 81, 1043 (1926). 

¹² K. H. Kingdon and I. Langmuir, Phys. Rev. 22, 148 (1923). 

¹³ J. J. Thomson, Rays of Positive Electricity (Longmans Green 
and Company, London, 1921). 

¹⁴ C. Starr, Phys. Rev. 36, 216 (1939). 

Numerous theories of cathode sputtering have been 
proposed,⁸⁻¹⁴ the generally accepted ones being those 
of von Hippel¹¹ and Kingdon and Langmuir.¹² The 
evaporation theory proposed by von Hippel, although 
not completely satisfactory, has been considered the 
most plausible explanation of the sputtering phenomenon 
and is more widely accepted than the latter, which 
is based upon momentum interchange and surface condition. 
According to von Hippel, an ion striking a point 
on a cathode surface will distribute its energy in the 
form of heat over a localized “hot spot” of atomic 
dimensions. The resultant extreme temperature will 
cause vaporization from a surface element, $\Delta F$, for a 
short time interval, $\Delta t$, the rate of vaporization being 
dependent upon the incident ion energy and the physical 
properties of the cathode. Sputtering rates were expected 
to rise exponentially to a maximum at ion 
energies of about one thousand volts. The decrease at 
higher energy was attributed to ion penetration of the 
surface with subsequent increase in the number of 
atoms affected at the “hot spot” resulting in reduced 
average temperature over the area $\Delta F$. Von Hippel’s 
theory showed correlation between certain sputtering 
series observed in a glow discharge and the latent heat 
of vaporization of the metal sputtered. The mass de- 
pendence of the phenomenon was inferred from the 
assumption that heavier ions would not penetrate the 
surface as much as lighter ions of equal energy. Townes¹⁰ 
has applied the above theory to describe sputtering in 
a gas at high pressure in which case the diffusion of 
sputtered material from the cathode is a dominant 
factor. Kingdon and Langmuir¹² proposed a description 
of the sputtering phenomenon which was suggested by 
their observations on the removal of thorium from a 
thoriated tungsten filament by positive ion bombard- 
ment. The emission vs time characteristic of the fila- 
ments showed an unexpected behavior, since, instead 
of an immediate reduction in emission with onset of the 
discharge, in some cases, the current remained steady 
for a short period and then began to decrease monotoni- 
cally to a value near that for a pure tungsten filament. 
In order to explain these results, Kingdon and Langmuir 
hypothesized a “crevice” theory which was assumed to 
occur in two steps. The first step was the formation of 
a dent or “crevice” at the point of ion impact, the 
incident ion recoiling in the backward direction; sput- 
tering could then result when a second ion struck this 
crevice and knocked out thorium atoms from around 
its edge. The steady period of emission was then con- 
sidered to be during the time ions were forming crevices 
on the thoriated tungsten surface. Seeliger and Sommermeyer¹⁵ 
performed an experiment which was designed 
to make a comparison of the momentum exchange 
theory and the evaporation theory. They drew ions 
from a glow discharge into an evacuated region and 
accelerated these canal rays to strike a molten gallium 
target at varying angles of incidence. The angular dis- 
tribution of sputtered material collected on a cylinder 
surrounding the target indicated that Knudsen’s cosine 
law described the density variation with respect to the 
normal. This was considered evidence in favor of the 
evaporation theory since the momentum exchange 
theory appeared to predict a preferred direction for the 
sputtered atoms. However, the evaporation theory has 
also been considered to have shortcomings since it does 
not satisfactorily explain (1) the dependence of sputtering 
on ion mass, (2) sputtering thresholds being many 
times greater than the atomic heat of sublimation, 
(3) secondary emission due to ion bombardment being 
considerably less than would occur if there were heating 
and consequent thermionic emission at the surface, and 
(4) the lack of marked temperature dependence of 
sputtering rates. 


¹⁵ R. Seeliger and K. Sommermeyer, Z. Physik 93, 692 (1935). 





HIGH-VACUUM SPUTTERING 


It is generally recognized that, despite the consider- 
able amount of work which has been devoted to its 
study, the phenomenon of sputtering has not been 
explained to any full degree of confidence. This may 
be largely attributed to the difficulty in making sputter- 
ing measurements in terms of the number of atoms 
sputtered per incident ion. Timoshenko¹⁶ performed an 
experiment which utilized a capillary arc ion source to 
measure absolute sputtering ratios for argon ions in- 
cident on silver. His experiment was conducted under 
apparently ideal conditions since the ions from the 
source were accelerated through a known potential 
difference into a sputtering chamber at high vacuum; 
secondary electrons at the target were returned and 
there was no back diffusion of sputtered atoms. These 
measurements indicated a sharp rise in the sputtering 
ratio when the ion energy was increased above 2800 
volts. This writer¹⁷ has essentially repeated Timoshenko’s 
experiment by use of a Philips Ion Gauge 
discharge ion source and has observed that sputtering 
rates decrease gradually with decreasing ion energy. 
Moreover, it was stated in the previous letter that the 
sputtering ratios for argon ions incident on silver could 
be closely fitted to the number of collisions made by 
the ion in a process similar to neutron cooling with the 
silver lattice acting as a moderator. This led to a 
logarithmic form for the curve of sputtering ratio vs 
incident ion energy. The purpose of this paper will be 
to present additional data and to give a description of 
the sputtering process which includes the elementary 
concepts of neutron cooling and radiation damage 
theory to a metallic surface. For the sake of clarity, 
the experimental method will be described in greater 
detail than in the previous letter and some arguments 
in favor of the radiation damage concept of high- 
vacuum sputtering will be considered. 


EXPERIMENTAL 


A. Experimental Equipment 


A Philips Ionization Gauge (P.I.G.) discharge has 
been used as a source of ions in this investigation because 
of its ability to supply ion currents the order of 
30 000 microamperes per cm² at a gas pressure of a few 
microns. The method of forming a beam of ions was 
similar to that of Keller¹⁸ who used the P.I.G. discharge 
for high-energy bombardment by alpha particles. Operation 
and characteristic behavior of this type discharge 
have been described by Penning¹⁹ and Backus.²⁰ The 
discharge geometry in this study consisted of a pair of 
water-cooled copper cathodes separated by two plane 
grounded plates with one-half inch apertures for defining 
the discharge plasma. The discharge electrodes 
were supported in a vacuum chamber (inside dimensions 
11½ in. × 11½ in. × 9 in.) which was fixed between 
the poles of a 13 in. pole-face electromagnet. A ballast 
resistor of 30 000 ohms was placed in the cathode line 
to reduce discharge currents to about five milliamperes; 
this stabilized the discharge and prolonged the life of 
cathodes to the order of 15 hours. Dependent upon the 
gas in use, the discharge operated with approximately 
1600 volts on the cathodes at a pressure of one-half to 
four microns in a magnetic field of 1000 gauss. All gases 
used were commercially available tank gas of 99.2 percent 
purity which was adequate to meet the requirements 
of this experiment. Currents and voltages were 
measured by calibrated standard commercial meters. 
Base pressure of 0.05 micron was maintained by use of 
a DPI, model 201, oil diffusion pump and pressure was 
measured by an RCA 1949 ion gauge. 

¹⁶ G. Timoshenko, J. Appl. Phys. 12, 69 (1940). 

¹⁷ F. Keywell, Phys. Rev. 87, 160 (1952). 

¹⁸ R. Keller, Helv. Phys. Acta 21, 170 (1948). 

¹⁹ F. M. Penning and J. H. A. Moubis, Physica 4, 71 (1937). 

²⁰ J. Backus, in Gaseous Electrical Discharges in Magnetic Fields, 
edited by A. Guthrie and R. K. Wakerling (McGraw-Hill Book 
Company, Inc., New York, 1949), Chap. 11, p. 345, National 
Nuclear Energy Series, Plutonium Project Record, Vol. 5, Div. I. 


B. Ion Beam 


The P.I.G. discharge was used to supply a beam of 
ions with uniform energy by means of a 0.115-in. hole 
in one of the cathodes; the ions issuing from this hole 
were accelerated through the 0.250-in. aperture of a 
cylindrical target shield. The target to be sputtered was 
aligned with the beam and supported on an insulated 
wire which passed through the top of the shield. The 
shield performed the function of returning secondary 
electrons to the target by being biased 150 volts nega- 
tive with respect to the sample being bombarded. 
Return of secondary electrons to the target could be 
observed as variations in the target current by varying 
the potential between shield and target from plus to 
minus polarity. The biasing voltage between shield and 
target was assuredly sufficient to prevent escape of 
secondary electrons from the target surface. The targets 
were polished, cleaned in alcohol, then ether and finally 
cleaned by sputtering. The electrical circuit associated 
with the ion source is shown in Fig. 1. 

FIG. 1. The ion beam electrical circuit. 

TABLE I. Some ion beam operating conditions. 

       Pressure    P.I.G. voltage   P.I.G. current   Target voltage   Target current 
Gas    (microns)   (volts)          (ma)             (volts)          (μa) 
He     4.0         1800             6.0              5700             240 
Ne     2.0         1500             6.0              4000             250 
A      0.9         1850             6.0              2400             125 
Kr     0.5         1850             5.0              4200             108 

The above relatively simple arrangement of elec- 
trodes has provided an ion source capable of supplying 
100-220 microamperes of positive-ion current, depend- 
ent upon the gas under study. A few typical sets of 
operating conditions are indicated in Table I. Gas 
pressure in the vacuum chamber was low enough to 
allow gaseous mean free paths of two to nine cm. 


C. Method of Measurement 


Absolute sputtering ratios have been measured for a 
number of gas-metal combinations by essentially count- 
ing the number of ions which have struck the target and 
the number of atoms which have been sputtered. This 
was accomplished by measuring (1) weight loss of the 
bombarded sample, (2) true positive ion current to the 
sample, (3) time of bombardment for ions of known 
initial energy. The absolute sputtering ratio for a metal 
bombarded by ions of known energy, $E_0$, was given by 

$$n_a = 96\,500\,\Delta W/(A I_+ t),$$

where $I_+$ = positive ion current to the target (microamperes), 
$t$ = time of bombardment (sec), $\Delta W$ = weight 
loss of target (micrograms), and $A$ = atomic weight of 
material. Gas pressure in the vacuum chamber was 
metered during each test to maintain the ion current 
steady to within five percent. Weight losses of the 
sputtered samples ranged from 500-4500 micrograms 
and were measured by means of a microbalance. The 
data recorded for a series of tests with silver metal 
bombarded by krypton ions is given in Table II as an 
illustration of the method of measurement. Absolute 
sputtering ratio data similar to those in Table II have 
been obtained for the gas-metal combinations: Kr-Ag, 
A-Ag, Ne-Ag, He-Ag, A-Pb, He-Pb, Kr-Cu, A-Cu. 
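
The arithmetic behind the sputtering ratio formula can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is hypothetical, the atomic weight of silver is taken as 107.88, and the sample inputs (100 microamperes, 900 sec, 646 micrograms) are assumed to belong to one row of Table II.

```python
FARADAY = 96_500.0  # coulombs per gram-equivalent, the constant used in the text


def sputtering_ratio(weight_loss_ug, atomic_weight, current_ua, time_s):
    """Atoms sputtered per incident singly charged ion.

    weight_loss_ug / atomic_weight is the metal removed in micromoles;
    current_ua * time_s is the ion charge delivered in microcoulombs,
    so the micro- prefixes cancel and the result is dimensionless.
    """
    if current_ua <= 0 or time_s <= 0:
        raise ValueError("ion current and bombardment time must be positive")
    return FARADAY * weight_loss_ug / (atomic_weight * current_ua * time_s)


if __name__ == "__main__":
    # Assumed inputs: silver target (A = 107.88), 100 microamperes, 900 sec, 646 micrograms.
    print(round(sputtering_ratio(646.0, 107.88, 100.0, 900.0), 2))
```

The formula presumes singly charged ions; the doubly charged fraction mentioned later in the text would enter as a correction to the current term.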


TABLE II. Sputtering ratio data for silver bombarded 
by krypton ions. 

Target current   Time of bombardment   ΔW 
(μa)             (seconds)             (μg) 
100              900                   646 
96               900                   658 
88               1800                  1275 
123              900                   1306 
133              900                   1346 
139              900                   1605 











A summary of the experimental data is shown in 
Figs. 2, 3, and 4; the curves fitted to the experimental 
points will be explained in the following section. The 
upper limit of ion energy was determined for each gas 
by the onset of intense arcing parallel to the magnetic 
field when high voltage was applied to the accelerating 
shield. The lower limit of ion energy was caused by 
space charge formation within the shield and occurred 
at higher energy for more massive ions such as krypton 
and argon. The concentration of doubly charged ions in 
the beam is believed large enough to introduce an error 
of about four percent. Error in measurement of the ion 
current due to ions rebounding from the target surface 
as ions or metastable atoms is estimated the order of 
two percent or less. The estimated independent 
probable errors are (1) ion current, eight percent, 
(2) ion energy, five percent, (3) weight loss, two percent, 
(4) bombardment time, nil. Although there is 
considerable scatter in the observed data, it is felt that 
the measurements have a total probable error of about 
ten percent and represent a measure of absolute sputtering 
ratios. 

FIG. 2. Silver series of sputtering ratio vs ion energy 
for Ag-Kr, Ag-A, Ag-Ne, Ag-He. 
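
The text does not say how the individual errors were combined. One common convention for independent errors is a root-sum-square, and the short sketch below (function name hypothetical) shows that this assumption yields a figure close to the quoted ten percent.

```python
import math


def total_probable_error(*fractional_errors):
    """Combine independent fractional errors in quadrature (root-sum-square).

    This is an assumed combination rule, not one stated in the paper.
    """
    return math.sqrt(sum(e * e for e in fractional_errors))


if __name__ == "__main__":
    # Ion current 8 percent, ion energy 5 percent, weight loss 2 percent.
    print(f"{100 * total_probable_error(0.08, 0.05, 0.02):.1f} percent")
```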


INTERPRETATION OF THE DATA 
A. Silver-Argon Data 


Sputtering of silver metal by argon ions was initially 
studied to compare results with those of Timoshenko.¹⁶ 


²¹ H. D. Hagstrum has described an instrument capable of 
measuring the net secondary and tertiary currents which arise 
due to reflection of ions and metastable atoms of the noble gases 
from a clean tungsten surface. In private conversation, Dr. 
Hagstrum has kindly given this writer data which indicate that 
the currents due to rebounding ions would be small in the present 
case. His measurements for 1000-volt ions of He⁺, Ne⁺, A⁺, and 
Kr⁺ on clean tungsten show the loss of ion current caused by 
reflected ions to be the order of one percent for He⁺ decreasing 
to 0.06 percent for Kr⁺. 

²² H. D. Hagstrum, Rev. Sci. Instr. 24, 1122 (1953). 










As stated in the previous communication, both sets of 
data were in agreement at high energy but it was ob- 
served in this investigation that sputtering ratios de- 
creased more gradually with decreasing ion energy than 
was previously reported. Furthermore, it was noted 
that the number of atoms sputtered per incident ion 
was considerably less than energetically possible.²³ For 
the silver-argon case, however, the sputtering ratios 
observed were more nearly the number of collisions 
required by the ion to “cool” to about 40 ev where the 
silver lattice acts as a moderator and the argon atom 
loses energy by a diffusion-collision process. When 
elementary neutron cooling theory²⁴ is applied to such 
a process, an ion of atomic weight $M_1$ which is being 
cooled by a moderator of atoms of atomic weight $M_2$ 
will have energy after $n$ collisions given by 

$$E_n = E_0 e^{-n\xi}, \qquad (1)$$

where 

$$\xi = 1 - \frac{(M_2 - M_1)^2}{2 M_1 M_2} \ln\left(\frac{M_2 + M_1}{M_2 - M_1}\right).$$

The silver-argon data were fitted by assuming the argon 
ion cannot cause sputtering when its energy is less than 
39 ev; this gives, for the number of ion collisions, 

$$n_c = (1/0.59) \ln(E_0/39),$$



which agrees closely with the observed number of 
silver atoms sputtered per incident argon ion. 
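
As a check on the numbers quoted above, the following sketch evaluates the logarithmic decrement $\xi$ of Eq. (1) and the resulting collision count. The function names are hypothetical, the atomic weights (argon 40, silver 108) are rounded assumptions, and the absolute value inside the logarithm is an added assumption so that the same expression can be evaluated when the ion is the heavier partner.

```python
import math


def log_decrement(m_ion, m_atom):
    """xi of Eq. (1) for an ion of mass m_ion cooled by atoms of mass m_atom."""
    if m_ion == m_atom:
        return 1.0  # limiting value of the expression for equal masses
    prefactor = (m_atom - m_ion) ** 2 / (2.0 * m_ion * m_atom)
    # abs() is an assumption that extends the formula to m_ion > m_atom.
    return 1.0 - prefactor * math.log((m_atom + m_ion) / abs(m_atom - m_ion))


def collisions_to_cool(e0_ev, e_stop_ev, xi):
    """Number of collisions n for which E_0 * exp(-n * xi) falls to e_stop_ev."""
    return math.log(e0_ev / e_stop_ev) / xi


if __name__ == "__main__":
    xi = log_decrement(40.0, 108.0)  # argon on silver, assumed masses
    print(round(xi, 2))
    # Collision count for a 4000-ev ion with the 39-ev cutoff used in the text.
    print(round(collisions_to_cool(4000.0, 39.0, 0.59), 1))
```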


B. Radiation Damage 


It would not be expected to find the above method 
to apply in the case of sputtering by light ions since 
the trends of collision number and sputtering ratio are 
inverse with decreasing ion mass. Thus, the application 
of neutron cooling theory is not enough to explain the 
observed data but the success attained in the silver- 
argon case indicated a collision mechanism is involved. 
A more realistic description would surely be attained 
by recognizing that an ion can transfer a considerable 
fraction of its energy to a metallic atom in one collision,²⁵ 
thus forming an energetic recoil atom which can then 
strike other atoms in the lattice to produce secondary, 
tertiary, etc., recoil atoms. A process of this nature 
leads one to consider the effect as being a radiation 
damage phenomenon; consequently, it would appear 
possible to apply the theory of radiation damage, as 
developed by Seitz,²⁶ to the subject of high-vacuum 
sputtering. Seitz has given a method for calculating the 
number of displaced atoms produced in a solid due to 
passage of an energetic particle (alpha, proton, neutron) 
of high initial energy. The effect of energy transfer to 
electrons was considered as well as energy loss due to 
“knock-on” collisions. At lower energies, however, the 
chief process of loss is energy transfer by “hard” 
collisions. Therefore, in the case of ions at least as 
massive as helium ions and of energy less than 6000 ev, 
it is felt that ion-atom collisions in a conductor can be 
considered hard-sphere collisions. Employing the assumption 
of hard-sphere collisions, we can calculate the 
number of atoms displaced at a metallic surface due to 
an incident gas ion. This number should be related to 
the number of atoms sputtered when account is taken 
of the absorption of displaced atoms which were formed 
a few atomic layers beneath the surface. 

²³ Since the atomic heat of sublimation at a metallic surface is 
the order of 3-5 ev, a 4000-volt ion possesses sufficient energy to 
remove the order of 1000 atoms as compared to actual sputtering 
ratios of less than 10 atoms per ion. 

²⁴ E. Fermi, Nuclear Physics, notes compiled by Orear, Rosenfeld, 
and Schluter (University of Chicago Press, Chicago, 1950), 
p. 181. 

²⁵ For example, a 4000-ev argon ion will transfer, on the average, 
1580 ev to a silver atom; this energy is considerably greater than 
the displacement energy of about 21 ev. 

²⁶ F. Seitz, Discussions Faraday Soc. 5, 271 (1949). 

FIG. 3. Copper-krypton and copper-argon sputtering 
ratios vs ion energy. 

In the immediately following treatment, it will be 
assumed that the ion is more massive than the metallic 
atom and will continue in the forward direction into 
the lattice to remain beneath the surface after the first 
collision, i.e., no rebounding ions or gas atoms are 
present at the surface. 
FIG. 4. Lead-argon and lead-helium sputtering 
ratios vs ion energy. 
If we apply the neutron cooling formula, the ion 
energy at the $n$th collision will be 

$$E_n = E_0 e^{-n\xi},$$

and the average fraction of energy transferred to the 
metallic atom will be 

$$\epsilon = 2 M_1 M_2/(M_1 + M_2)^2, \qquad (2)$$

giving 

$$E_{n+1} = \epsilon E_0 e^{-n\xi} \qquad (3)$$

for the average energy of the $(n+1)$th metallic recoil 
atom. Using Seitz’ formula for the number of displaced 
atoms, $n_s$, produced by a recoil atom of energy $E$ 
in a metal of atomic displacement energy $E_d$, 

$$n_s = (E/E_d)^{1/2}, \qquad (4)$$

the number of displaced atoms at the $n$th collision will 
be $n_s = (E_n/E_d)^{1/2}$, or 

$$n_s = \left(\frac{\epsilon E_0}{E_d}\right)^{1/2} e^{-(n-1)\xi/2}. \qquad (5)$$

The total number of displaced atoms is 

$$N = \sum_{n=1}^{n_c} n_s = \left(\frac{\epsilon E_0}{E_d}\right)^{1/2} \sum_{n=1}^{n_c} e^{-(n-1)\xi/2}, \qquad (6)$$

$$n_c = (1/\xi) \ln(\epsilon E_0/E_d).$$
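
The quantities in Eqs. (2) and (6) are simple to evaluate numerically. The sketch below is illustrative only: the function names are hypothetical, the masses (argon 40, silver 108) are rounded assumptions, and $n_c$ is rounded to the nearest integer as described later in the text.

```python
import math


def energy_transfer_fraction(m_ion, m_atom):
    """Average fraction of energy given to the struck atom, Eq. (2)."""
    return 2.0 * m_ion * m_atom / (m_ion + m_atom) ** 2


def displaced_atoms(e0_ev, e_d_ev, eps, xi):
    """Total number of displaced atoms N of Eq. (6)."""
    x = eps * e0_ev / e_d_ev
    if x <= 1.0:
        return 0.0  # first recoil is below the displacement energy
    n_c = int(round(math.log(x) / xi))  # rounded to the nearest integer
    series = sum(math.exp(-(n - 1) * xi / 2.0) for n in range(1, n_c + 1))
    return math.sqrt(x) * series


if __name__ == "__main__":
    eps = energy_transfer_fraction(40.0, 108.0)  # argon on silver, assumed masses
    print(round(eps, 3))
    print(round(4000.0 * eps))  # mean energy given to the first silver recoil, ev
    print(round(displaced_atoms(4000.0, 21.0, eps, 0.59), 1))
```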
We treat absorption of displaced atoms within the 
metal by assuming that of $n_s$ displaced atoms formed 
at a depth $x$ below the surface, the number escaping 
from the surface will be 

$$n_e = e^{-\beta x} n_s. \qquad (7)$$

The depth $x$ will be related, on the average, to the 
number of collisions the ion has made by assuming the 
ion to progress by a random walk, in which case 

$$x = k\sqrt{n}. \qquad (8)$$

Then 

$$n_e = \left(\frac{\epsilon E_0}{E_d}\right)^{1/2} e^{-\alpha\sqrt{n}}\, e^{-(n-1)\xi/2}, \qquad (9)$$

where $\alpha = k\beta$, and the number of atoms sputtered is 
given by 

$$n_a = \left(\frac{\epsilon E_0}{E_d}\right)^{1/2} \sum_{n=1}^{n_c} e^{-\alpha\sqrt{n}}\, e^{-(n-1)\xi/2}. \qquad (10)$$

FIG. 5. Collision diagram. Subscript L denotes velocity in the 
laboratory frame. Primes denote velocity after collision. Subscript 
1 or 2 denotes gas or metal atom velocity. 
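
Equation (10) can be evaluated directly once $\epsilon$, $\xi$, $E_d$, and $\alpha$ are chosen. The sketch below is a minimal illustration; the function name is hypothetical, and the inputs in the usage example ($\epsilon$ = 0.39, $\xi$ = 0.59, $E_d$ = 21 ev, $\alpha$ = 1) are assumed values of the size discussed in the text rather than fitted parameters.

```python
import math


def sputtering_ratio_eq10(e0_ev, e_d_ev, eps, xi, alpha):
    """Atoms sputtered per penetrating ion according to Eq. (10)."""
    x = eps * e0_ev / e_d_ev
    if x <= 1.0:
        return 0.0  # below the threshold E_d / eps no atom is displaced
    n_c = int(round(math.log(x) / xi))
    series = sum(
        math.exp(-alpha * math.sqrt(n)) * math.exp(-(n - 1) * xi / 2.0)
        for n in range(1, n_c + 1)
    )
    return math.sqrt(x) * series


if __name__ == "__main__":
    # Assumed illustrative inputs; alpha = 1 is not a fitted value.
    for e0 in (500.0, 1000.0, 2000.0, 4000.0):
        print(e0, round(sputtering_ratio_eq10(e0, 21.0, 0.39, 0.59, 1.0), 2))
```

With $\alpha$ = 0 the absorption factor disappears and the function returns the total number of displaced atoms of Eq. (6).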


As previously stated, Eq. (10) does not include the 
effect of ions which rebound from the surface on the 
first collision as neutral atoms, retaining a considerable 
fraction of their energy and producing only one recoil 
atom (the surface atom which reversed the ion’s direction). 
Energetic recoil ions have been observed previously²⁷ 
and this effect has been studied in the case 
of light ions such as Li⁺.²⁸,²⁹ In considering the effect of 
ions or atoms rebounding at the surface, the following 
assumptions will be made as a first approximation to 
the case which may exist in nature: (1) The ion or gas 
atom penetrates into the volume of the metal if its 
deflection in the laboratory frame after the first collision 
is less than 90°. (2) The ion or gas atom rebounds from 
the surface if its deflection in the laboratory frame after 
the first collision is greater than 90°. 

If the mass of the incident ion, $M_1$, is less than the 
mass of the metal atom, $M_2$, there will be an angle of 
deflection $\theta_0$ in the c.m. system such that angles less 
than $\theta_0$ are penetrating cases and angles greater than 
$\theta_0$ give rise to rebounding ions. It is seen by the accompanying 
diagram (Fig. 5) that 

$$\cos\theta_0 = -M_1/M_2,$$

and the probability of a penetrating type collision is 

$$f_p = \theta_0/\pi.$$

The probability of the ion rebounding on the first 
collision is 

$$f_r = 1 - f_p,$$

since we consider the ion to either rebound or penetrate 
the surface on the first collision. The surface is also, in 
effect, considered to be both clean and smooth; thus the 
first collision will be an ion-metal atom collision and a 
rebound atom will not strike a crevice but will, with 
certainty, escape from the surface. 

For Ag-Kr, $f_r$ = 0.22 and for Ag-He, $f_r$ = 0.49; hence, 
the percentage of rebounding cases is comparable to 
that of the penetrating cases for lighter rare gas ions 
incident on silver, a typical series of gas-metal combinations. 

²⁷ Longacre (see reference 28) reflected Li⁺ ions from a nickel 
surface and found that the fraction of energy retained by the 
ions was less with increasing angle of deviation. By analysis of 
his data, he concluded the scattering was due to elastic impacts 
at a roughened surface with small coefficient of restitution. 

²⁸ A. Longacre, Phys. Rev. 46, 407 (1934). 

²⁹ R. W. Gurney, Phys. Rev. 32, 467 (1928). 
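
The rebound fractions quoted above follow from the two relations for $\theta_0$ and $f_p$. The sketch below reproduces them under the assumption of rounded atomic weights (krypton 84, helium 4, silver 108); the function name is hypothetical.

```python
import math


def rebound_fraction(m_ion, m_atom):
    """Probability f_r that the ion rebounds on its first collision.

    theta_0 is the c.m. deflection at which the laboratory deflection
    reaches 90 degrees, cos(theta_0) = -m_ion / m_atom, and f_p = theta_0 / pi.
    An ion at least as heavy as the atom cannot be turned back, so f_r = 0.
    """
    if m_ion >= m_atom:
        return 0.0
    theta_0 = math.acos(-m_ion / m_atom)
    return 1.0 - theta_0 / math.pi


if __name__ == "__main__":
    print(round(rebound_fraction(84.0, 108.0), 2))  # krypton on silver
    print(round(rebound_fraction(4.0, 108.0), 2))   # helium on silver
```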

Figure 5 indicates that each rebounding collision will 
result in a metallic recoil atom directed toward the 
volume of the metal. If the current of ions to the metal 
is $I$ at energy $E_0$, there will be a fraction, $f_p I$, of the 
current which penetrates the surface; these ions will 
be considered to cause sputtering according to Eq. (10). 
In addition, there will be a fraction, $f_r I$, of ions which 
produce metallic recoil atoms at a new average energy 
$E_r$ which is essentially a second type of particle incident 
at the surface. Therefore, the current at the surface can 
be considered to consist of two components: (1) incident 
ions in a parallel beam at energy $E_0$ and (2) recoil metal 
atoms incident within a cone of semivertical angle 
$(\pi - \theta_0)/2$ and with average energy $E_r$. The energy $E_r$ 
is calculated by averaging the maximum recoil energy 
$2\epsilon E_0$, and the recoil energy for ion deflection at $\theta_0$, 
$\Delta E = 2\epsilon E_0 \sin^2(\theta_0/2)$, which gives 

$$E_r = \epsilon E_0 \left[1 + \frac{1 + M_1/M_2}{2}\right]. \qquad (11)$$

If $n_a$ = average number of atoms sputtered per ion, 
$n_a^p$ = average number of atoms sputtered per ion which 
penetrates the surface, $n_a^r$ = average number of atoms 
sputtered per recoil metal atom at the surface, and with 
$I = f_p I + f_r I$ (ions per second), we have 

$$n_a = f_p n_a^p + f_r n_a^r. \qquad (12)$$

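By Eq. (10), 

$$n_a^p = \left(\frac{\epsilon E_0}{E_d}\right)^{1/2} \sum_{n=1}^{n_c} e^{-\alpha\sqrt{n}}\, e^{-(n-1)\xi/2}.$$

By Eqs. (4) and (9), 

$$n_a^r = (E_r/E_d)^{1/2} e^{-\alpha}, \qquad (13)$$

as we consider the escape of displaced atoms due to one 
recoil metallic atom. 

Since the magnitude of $E_r$ is between $1.5\epsilon E_0$ and 
$2\epsilon E_0$, we have the relation 

$$1.22\sqrt{\epsilon E_0} < \sqrt{E_r} < 1.42\sqrt{\epsilon E_0}$$

and 

$$\sqrt{E_r} \approx 1.32\sqrt{\epsilon E_0},$$

which is accurate to at least seven percent for all gas-metal 
combinations. 
Thus by Eq. (13) and the approximate value for $\sqrt{E_r}$, 

$$n_a^r \approx 1.32\,(\epsilon E_0/E_d)^{1/2} e^{-\alpha}; \qquad (14)$$

and by (12) and (14), 

$$n_a = \left(\frac{\epsilon E_0}{E_d}\right)^{1/2} \sum_{n=1}^{n_c} \left(f_p + 1.32\,\delta_{n1} f_r\right) e^{-\alpha\sqrt{n}}\, e^{-(n-1)\xi/2}, \qquad (15)$$

$$n_c = (1/\xi)\ln(\epsilon E_0/E_d), \qquad \delta_{n1} = \begin{cases} 0, & n \neq 1 \\ 1, & n = 1. \end{cases}$$

Equation (15) reduces to Eq. (10) if $M_1 > M_2$. 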
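
A compact way to see how the pieces of Eq. (15) fit together is to compute every collision constant from the two masses and then form the weighted sum. The sketch below is an illustration under assumptions, not the author's procedure: the function names are hypothetical, the masses are rounded, $E_d$ = 21 ev and $\alpha$ = 1 are assumed inputs, and the absolute value in the logarithm is added so that $\xi$ can be evaluated for ions heavier than the metal atom.

```python
import math


def collision_constants(m_ion, m_atom):
    """Return (eps, xi, f_p, f_r) from hard-sphere collision theory."""
    eps = 2.0 * m_ion * m_atom / (m_ion + m_atom) ** 2
    if m_ion == m_atom:
        xi = 1.0
    else:
        xi = 1.0 - (m_atom - m_ion) ** 2 / (2.0 * m_ion * m_atom) * math.log(
            (m_atom + m_ion) / abs(m_atom - m_ion)
        )
    f_p = 1.0 if m_ion >= m_atom else math.acos(-m_ion / m_atom) / math.pi
    return eps, xi, f_p, 1.0 - f_p


def sputtering_ratio_eq15(e0_ev, e_d_ev, m_ion, m_atom, alpha):
    """Atoms sputtered per incident ion according to Eq. (15)."""
    eps, xi, f_p, f_r = collision_constants(m_ion, m_atom)
    x = eps * e0_ev / e_d_ev
    if x <= 1.0:
        return 0.0
    n_c = int(round(math.log(x) / xi))
    total = 0.0
    for n in range(1, n_c + 1):
        # The rebound term 1.32 * f_r contributes only at the first collision.
        weight = f_p + (1.32 * f_r if n == 1 else 0.0)
        total += weight * math.exp(-alpha * math.sqrt(n)) * math.exp(-(n - 1) * xi / 2.0)
    return math.sqrt(x) * total


if __name__ == "__main__":
    # Argon on silver with assumed E_d = 21 ev and alpha = 1.
    print(round(sputtering_ratio_eq15(4000.0, 21.0, 40.0, 108.0, 1.0), 2))
```

For an ion heavier than the metal atom the rebound fraction vanishes and the function returns the Eq. (10) value.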
TABLE III. Parameters for sputtering ratio formula. 

ε        ξ 
0.39     0.59 
0.27     0.33 
0.069    0.073 
0.49     0.92 
0.47     0.84 
0.27     0.34 
0.037    0.038 








In order to test Eq. (15), one should be able to fit the
curve of $n_a$ vs $E_0$ calculated by this equation to the
observed experimental data by a suitable choice of the
parameter $a$. $E_d$, $\epsilon$, and $\xi$ are constants characteristic of
each gas-metal combination. The displacement energy
for silver and copper is reported to be 20-25 ev, but it has
not been measured for lead. Fortunately, the uncertainty
in the displacement energy for the sputtering
process can be removed by means of sputtering threshold
data for silver, copper, and lead bombarded by
mercury ions due to the work of Wehner. The sputtering
threshold, according to formula (10), is $E_t=E_d/\epsilon$;
using Wehner's threshold data $E_t$(Ag-Hg) = 40-50 ev,
$E_t$(Cu-Hg) = 50-70 ev, $E_t$(Pb-Hg) = 20-40 ev and values
of $\epsilon$ calculated by Eq. (2), the displacement energies
for silver, copper, and lead are about 21, 21, and 15 ev
respectively. A fair fit to the experimental data has been
obtained for the gas-metal combinations and values of
the parameter $a$ listed in Table III, as shown
in Figs. 2, 3, and 4. The plot of $n_a$ vs $E_0$ according to
Eq. (15) is shaped over most of the energy range by
the term $\sqrt{\epsilon E_0}$, which is proportional to the average
recoil momentum of the first metal atom struck at the
surface. The summation factor serves to scale the curves
by means of the parameter $a$. It is to be noted that all
of the quantities involved in Eq. (15) are consistent,
as we have: (1) $\epsilon$, $\xi$, $f_p$, $f_r$, and $n_c$ (rounded off to the
nearest integer) are calculated by classical collision
theory. (2) $E_d$ as calculated by $E_d=E_t\times\epsilon$ using
Wehner's sputtering threshold data for Ag and Cu
bombarded by Hg⁺ ions is in agreement with previous
direct measurements of this constant. (3) Values of $a$
of the order of one indicate that about one-third of the
atoms displaced at the surface on the first collision
escape as sputtered atoms (the equation would be
exceedingly questionable if the $a$'s were an order of
magnitude lower or higher).
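The threshold relation used above (displacement energy = threshold energy times the collision parameter) can be checked with a short sketch. Eq. (2) is not reproduced in this part of the paper, so the function `mean_energy_transfer` below is an assumption: it takes the collision parameter to be the mean hard-sphere energy-transfer fraction, 2M1M2/(M1+M2)^2, a form that is consistent with the tungsten entries of Table IV. The atomic masses are modern standard values, not taken from the paper.

```python
def mean_energy_transfer(m_ion: float, m_target: float) -> float:
    """Assumed form of the collision parameter: 2*M1*M2/(M1+M2)**2."""
    return 2.0 * m_ion * m_target / (m_ion + m_target) ** 2


def displacement_energy(threshold_ev: float, m_ion: float, m_target: float) -> float:
    """Displacement energy implied by a sputtering threshold, E_d = E_t * epsilon."""
    return threshold_ev * mean_energy_transfer(m_ion, m_target)


M_HG = 200.59  # atomic mass of mercury (amu)

# metal: (atomic mass in amu, (low, high) threshold range in ev quoted from Wehner)
TARGETS = {
    "Ag": (107.87, (40.0, 50.0)),
    "Cu": (63.546, (50.0, 70.0)),
    "Pb": (207.2, (20.0, 40.0)),
}


def displacement_energy_ranges() -> dict:
    """Map each metal to the (low, high) displacement energy in ev."""
    ranges = {}
    for metal, (mass, (low, high)) in TARGETS.items():
        ranges[metal] = (
            displacement_energy(low, M_HG, mass),
            displacement_energy(high, M_HG, mass),
        )
    return ranges


if __name__ == "__main__":
    for metal, (low, high) in displacement_energy_ranges().items():
        print(f"{metal}: E_d between {low:.1f} and {high:.1f} ev")
```

Under this assumption the quoted values of about 21, 21, and 15 ev fall inside the computed ranges for silver, copper, and lead.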

One consequence of a description of sputtering based 
on a collision mechanism is the possibility of relating 
“equivalent ion energies” for a pair of ions on a common 
metal. Since the number of atoms sputtered is deter- 
mined largely by the energy of the first few metallic 
recoil atoms, it would seem that, when the average energy
transferred on the first collision to a metal atom, type 1,
by an ion, type 2, at energy $E_2$ is the same as the energy
transferred by an ion, type 3, at energy $E_3$, the sputtering
ratios should be approximately equal for the two
ions. That is, sputtering of a metal by two types of
ions should tend to be the same if the ion energies are
related by

$$\epsilon_{12}E_2=\epsilon_{13}E_3, \qquad (11)$$

where $\epsilon_{12}$, $\epsilon_{13}$ are given by Eq. (2). Application of Eq.
(11) to the Pb-He, Pb-A curves is shown in Fig. 6 and
the curves for He⁺, Ne⁺, and A⁺ ions shifted relative to
krypton on silver metal are shown in Fig. 7. The curves
indicate the expected tendency of merging under equivalent
energy shifts; consequently, for a given sputtering
ratio, the energy differences observed experimentally
are considerably reduced by this transformation. It is
to be noted that the observed sputtering ratios in the
copper-argon and copper-krypton cases have a common
trend which is expected in view of the similarity of the
collision parameters $\epsilon$ and $\xi$ for these two cases.

FIG. 6. Equivalent energy shift of helium sputtering
ratio curve relative to argon on lead.
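The equivalent-energy relation for two ions on a common metal amounts to a rescaling of the energy axis. A minimal sketch follows; the function name and the numerical collision parameters in the usage line are illustrative assumptions, not values from the paper.

```python
def equivalent_energy(e_ion2_ev: float, eps_12: float, eps_13: float) -> float:
    """Energy of ion type 3 that transfers the same mean energy as ion type 2.

    Solves eps_12 * E_2 = eps_13 * E_3 for E_3.
    """
    if eps_12 <= 0.0 or eps_13 <= 0.0:
        raise ValueError("collision parameters must be positive")
    return eps_12 * e_ion2_ev / eps_13


if __name__ == "__main__":
    # Hypothetical parameters: ion 2 transfers twice the energy fraction of ion 3.
    print(equivalent_energy(1000.0, eps_12=0.30, eps_13=0.15))
```

Shifting every point of one sputtering-ratio curve by this factor is what the equivalent energy shifts of Figs. 6 and 7 represent.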


DISCUSSION 
A. “Hard’’ Collisions 


It is generally recognized that ions will not transfer 
appreciable energy to the atomic electrons of a solid or 
a gas if the velocity of the ion is small compared with 
the velocity of the electrons. Thus, at low ion energy, 
the chief mechanism of energy loss for an ion will be 
nuclear or “hard” collisions which will produce recoil 
atoms in the stopping material. At energy less than
5000 ev, a helium ion will have velocity small compared
to a five-volt electron; therefore, it appears safe to
assume that energy transfer to bound electrons by ions
of He⁺, Ne⁺, A⁺, or Kr⁺ will be negligible at ion energies
considered in this study. Transfer of energy to conduction
electrons within a solid, however, is possible at
all ion energies ranging from zero to $\zeta$, the Fermi
energy. Nevertheless, such transfer of energy to conduction
electrons is similarly an unlikely process because:
(1) Transfer of energy to electrons at the lower
levels of the Fermi sea is forbidden by a combination of
the exclusion principle and the small energy transfer
which is allowed on a classical basis for an ion-electron
collision. (2) By (1), the only electrons which can
receive energy from ions are those near the top of the
Fermi sea and these will have velocity large compared
with that of the ions. In the case of He⁺, however,
conditions are marginal and a calculation by Seitz'
method for helium incident on silver indicates that
energy transfer to conduction electrons may become
effective at about 4000 ev. This effect may cause
sputtering ratios, at higher energies than those considered
in this experiment, to show a maximum with
increasing ion energy which would be detectable at the
lowest energies for ions of helium or hydrogen.
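The velocity comparison behind the "hard" collision argument is easy to reproduce. The sketch below assumes nonrelativistic kinetic energies and modern values of the physical constants; the function name is illustrative.

```python
import math

M_E_KG = 9.1093837015e-31   # electron mass (kg)
AMU_KG = 1.66053906660e-27  # atomic mass unit (kg)


def velocity_ratio(ion_energy_ev: float, ion_mass_amu: float,
                   electron_energy_ev: float) -> float:
    """Ratio v_ion / v_electron for nonrelativistic kinetic energies.

    v = sqrt(2E/m), so the ratio is sqrt((E_ion/E_el) * (m_el/m_ion)).
    """
    mass_ratio = M_E_KG / (ion_mass_amu * AMU_KG)
    return math.sqrt((ion_energy_ev / electron_energy_ev) * mass_ratio)


if __name__ == "__main__":
    # Helium ion at 5000 ev compared with a five-volt electron.
    print(f"{velocity_ratio(5000.0, 4.0026, 5.0):.2f}")
```

For helium at 5000 ev the ratio is somewhat under one half, and it is smaller still for the heavier ions at the same energy.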


B. Sputtering Thresholds 


It was not possible to measure sputtering threshold 
energy by means of the apparatus used in this experi- 
ment. Nevertheless, it is of interest to speculate on this 
aspect of the problem in light of the previous considerations
related to a collision process. Equation (10)
predicts $n_a=0$ at energy $E_0=E_d/\epsilon$, which should be an
upper limit to the sputtering threshold, $E_t$, since an
ion with sufficient energy to produce displacements at
the surface should also be capable of releasing sputtered
atoms. A lower bound would be $E_\lambda/2\epsilon$, where
$E_\lambda$ = atomic heat of vaporization. Then $E_t$, for normal
incidence, satisfies the relation

$$E_\lambda/2\epsilon\le E_t\le E_d/\epsilon. \qquad (12)$$


A comparison between $E_\lambda/2\epsilon$, $E_d/\epsilon$, and sputtering
thresholds observed by Kingdon and Langmuir and
the General Electric Company, Ltd., is shown in
Table IV. This table indicates the tendency of observed
sputtering thresholds to lie within the bounds estimated
by a collision process. The case is more likely
described by a tendency for the sputtering ratio curve,
$n_a$ vs $E_0$, to break away from Eq. (10) at energy near
$E_d/\epsilon$ and to continue until it intersects the ion energy
axis at some intermediate threshold energy.

FIG. 7. Equivalent energy shift of helium, neon, and argon
sputtering ratio curves relative to krypton on silver.
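The two threshold bounds can be evaluated for tungsten as a rough check on Table IV. This is a sketch under stated assumptions: the collision parameter is taken as 2M1M2/(M1+M2)^2 (Eq. (2) is not reproduced here, and this form is inferred from the tabulated numbers), the heat of vaporization is 8.8 ev and the displacement energy 45 ev as in the table footnotes, and the atomic masses are modern standard values.

```python
def mean_energy_transfer(m_ion: float, m_target: float) -> float:
    """Assumed form of the collision parameter: 2*M1*M2/(M1+M2)**2."""
    return 2.0 * m_ion * m_target / (m_ion + m_target) ** 2


def threshold_bounds(m_ion: float, m_target: float,
                     e_vaporization_ev: float, e_displacement_ev: float):
    """Return (lower, upper) bounds on the sputtering threshold in ev.

    lower = E_vaporization / (2 * epsilon), upper = E_displacement / epsilon.
    """
    eps = mean_energy_transfer(m_ion, m_target)
    return e_vaporization_ev / (2.0 * eps), e_displacement_ev / eps


M_W = 183.84  # atomic mass of tungsten (amu)
IONS = {"A": 39.948, "Hg": 200.59, "He": 4.0026}

if __name__ == "__main__":
    for gas, mass in IONS.items():
        lower, upper = threshold_bounds(mass, M_W, 8.8, 45.0)
        print(f"{gas}: {lower:.1f} ev <= E_t <= {upper:.0f} ev")
```

With these inputs the bounds come out close to the tabulated pairs 15 and 153 ev for argon, 8.8 and 90 ev for mercury, and 106 and 1081 ev for helium.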


C. Theory and Observation 


The application of radiation damage theory to the 
problem of sputtering leads to considerations which 
justify its use in agreement with many of the results 
reported by previous investigators in the field. There 
has been some doubt as to the nature of the sputtered
material, some opinion being in favor of an atomic
process and others believing the sputtered material to
be in the form of “chunks” or globules. This writer
has observed globules of material being removed from
the cathode of a P.I.G. discharge when the operation of
the discharge (cathode voltage = 2500 v, pressure = 0.75
micron in air) was such that it was in a transitory state
between self-sustaining glow and cutoff with a frequency
of about two per second. However, when the
discharge was running in the steady glow condition, 
with ample supply of gas, it was always observed that 
the sputtered material was vaporous and deposited as 
an even film on a collector plate facing one cathode. 
It is felt by this writer that an atomic basis for sputtering
has been established in view of the following evidence:
(1) Sputtering ratios are small numbers of the
same order of magnitude for all gas-metal combinations
and, in many cases, can be fitted by the sputtering ratio
formula of Eq. (15). (2) The equivalent ion shift indicates
an additional relation between atomic masses.
(3) Sputtered films have the same general properties as 
evaporated films. (4) Many observers report spectral 
lines of the cathode material at the cathode of a glow 
discharge. (5) The time an ion is effective over a volume 
at a particular region of the cathode surface is extremely 
small compared to the time interval between collisions 
at the region. 

The results of Seeliger and Sommermeyer are con- 
sistent with the radiation damage concept since a 
directed beam of atoms would not be expected to pro- 
duce sputtering in a “preferred” direction due to the 
multiple collisions occurring at the surface. Electrons, 
of course, would not be expected to cause sputtering at 
low energy since (1) they transfer a major portion of 


TABLE IV. Sputtering thresholds. 








A. For tungsten observed by General Electric Company, Ltd. 
Gas    $E_t$ (observed)    $E_\lambda/2\epsilon$ (ev)    $E_d/\epsilon$ (ev)





A 85 25 15 153 
Hg 50 50 8.8 90 
H₂ begins ~700 206 2112
He begins ~350 106 1081 





B. For thoriated tungsten observed by Kingdon and Langmuir. 
$E_t$ (observed) (ev)    $E_t$ (calculated by Kingdon and Langmuir) (ev)    $E_\lambda/2\epsilon$ (ev)    $E_d/\epsilon$ (ev)





527 260 2664 
30.0 306 
17.6 180
9.5 97.1 
8.9 90.5 
132 1350 








a $E_\lambda$ = 8.8 ev.
b $E_d$ = 45 ev (from Wehner's data, using $E_t$(W-Hg) = 90 v).
c $E_d$ = 45 ev (assumed the same as tungsten).


their energy to conduction electrons and bound elec- 
trons (as evidenced by heating and production of 
secondary electrons) and (2) their mass is too small to 
transfer appreciable energy to a metallic atom in one 
collision. The only temperature dependence of sputter- 
ing should be derived from a possible reduction in the 
energy required to remove an atom from the surface 
of the metal. This may cause a slight temperature de- 
pendence, but, as has been frequently observed, the 
temperature dependence, allowing for vaporization, 
should not be very strong. 

It is believed the concepts of radiation damage at a 
metallic surface due to high-energy ion bombardment 
have been useful in explaining the form of absolute 
sputtering ratio curves and are in agreement with many 
of the known observations regarding sputtering. If the 
views expressed in this paper are correct, it is hoped 
that future work will continue to support these concepts 
and lead to a better understanding of the sputtering 
phenomenon. 
From measurements of the Hall effect and resistivity on single
crystals of PbS, the mobility of electrons and holes has been found 
over the temperature range 77-600°K. The crystals had donor 
or acceptor concentrations in the range 10!*-10'%/cm’. These 
data are used to critically examine theoretical work on the mo- 
bility in polar crystals and to obtain estimates of the effective 
masses of electrons and holes. The perturbation theory of Fröhlich
and Mott (F-M), and Howarth and Sondheimer (H-S) for the
scattering of electrons by the polar (optical) modes of the lattice
vibration is first considered. The expansion parameter of the perturbation
theory is evaluated as $0.28\,(m_e^*/m_e)^{1/2}$ from published
data on PbS; $m_e$ is the free and $m_e^*$ is the effective electron mass.
The polaron theory of mobility of Low and Pines, which has been
developed to replace the perturbation theory when the expansion
parameter approaches or exceeds unity, is then discussed. $m_e^*/m_e$
is the only unknown parameter in the two theories. The mobility
data is first compared with the F-M, H-S theory, from which
$m_e^*/m_e=0.33$, $m_h^*/m_e=0.36$. These effective masses, when substituted
in the Hall equation, provide for reasonable agreement
with the Hall data. However, at high and low temperatures, discrepancy
exists between the theoretical and experimental mobility
curves. Comparison of the mobility data with the polaron theory
shows that the polaron theory is nearly identical with the F-M,
H-S theory at low temperatures. The polaron theory has not been
completed for the high temperature region.

Since the data does not indicate the presence of impurity scattering,
an analysis is made combining polar (F-M, H-S theory)
and acoustical scattering, which yields $m_e^*/m_e=0.22$ and $m_h^*/m_e=0.10$.
These effective masses give theoretical Hall curves in
reasonably good agreement with the data. While the theoretical 
and experimental mobility curves now agree at low and inter- 
mediate temperatures, discrepancy still exists at high tempera- 
tures. Possible reasons for the discrepancy are discussed. It is 
concluded that the most likely source of error lies in perturbation 
theory expressions for the scattering cross sections for electron 
energies greater than that corresponding to the frequency of the 
polar vibration. Comparison of the high-temperature data with 
the polaron theory awaits further development of the theory and 
should provide a sensitive test of the general polaron theory, as 
well as being of interest in the theory of mobility in polar crystals. 





I. INTRODUCTION 


MEASUREMENTS of the mobility of electrons
and holes in crystals provide a means of studying
basic scattering mechanisms in solids. When the
mobility dependence is the same in crystals with dif- 
ferent concentrations of donor and acceptor centers, 
the scattering can be attributed primarily to an inter- 
action between electrons and lattice. Mobility studies 
then provide a basic method for investigating this 
fundamental aspect of solids. 

Considerable study has been made of the mobility 
in nonpolar crystals such as germanium and silicon, 
and generally good agreement exists between theory 
and experiment. The classical polar crystals, the alkali 
halides, have low-electron conductivity with the result 
that mobility measurements¹ and their interpretation
are complicated by ionic conductivity, polarization 
effects, and noise. Because of these difficulties, the 
simple dc Hall effect-resistivity method of measuring 
mobility has not been used. 

Lead sulfide is a semiconducting polar crystal with 
high-electron conductivity at room temperature. Con- 
ventional Hall-resistivity methods for measuring mo- 
bility can be used with accuracy similar to that ob- 
tained on germanium. PbS therefore provides a useful 
material on which to study mobility in polar crystals. 

Previous experiments² on PbS and other lead compounds
indicate that the mobility can be represented


1A. J. Redfield, Phys. Rev. 94, 537 (1954); F. C. Brown, 
Phys. Rev. 92, 502 (1953); J. R. Haynes and W. Shockley, Phys. 
Rev. 82, 935 (1951). 

² An excellent review of the work on PbS and other lead compounds
up to 1953 is given by R. A. Smith, Advances in Physics


by an equation of the form $\mu=\mu_0T^{-5/2}$ over the temperature
range 100°K-700°K. No theoretical basis for
such a variation has been given. 

A recent advance³ in the method of preparation of
crystals of PbS has provided homogeneous single
crystals of n and p types over a wide range of impurity
concentrations, including high purity material. In this
paper, we use the Hall-resistivity data³ obtained on
these crystals to calculate mobility curves. This mo- 
bility data, in addition to providing basic experimental 
information concerning the motion of electrons and 
holes in PbS, is used as a basis for discussing, and in a 
sense evaluating, theoretical work on mobility in polar 
crystals. 

Mobility studies have further interest as a method of 
measuring the effective mass of electrons and holes 
since the effective mass is the only unknown parameter 
in the polar mobility equation. In contrast, the acousti- 
cal scattering theory has the wave function and the 
effective mass in the mobility equation. 


II. DISCUSSION OF THEORETICAL STUDIES OF 
THE MOBILITY IN POLAR CRYSTALS 

In a crystal, conduction electrons are scattered at 
low temperatures by lattice defects and impurity atoms. 
At higher temperatures scattering due to thermal 
vibrations of the ions becomes significant. In ionic 
crystals these vibrations may be resolved into two 
general types. In one, called the acoustical mode, the 
positive and negative ions move in the same direction 
so that there is very little polarization effect. In the 
other, called the polar or optical mode, positive and 


3 R. F. Brebrick and W. W. Scanlon, Phys. Rev. 96, 598 (1954). 
negative ions move in opposite directions and create
polarization fields. When the polar modes are excited,
they are expected to be more effective in scattering
electrons than the acoustical or impurity mechanisms.

The first studies of the scattering of electrons by the
polar modes were made by Fröhlich and Mott⁴ and by
Davydov and Shmushkevitch⁵ from a quantum mechanical
point of view, and by Seeger and Teller⁶ by a
classical method. The results at absolute zero were
essentially the same; Seeger and Teller did not discuss
the temperature dependence. Fröhlich and Mott used
the perturbation method for treating the scattering of
electrons by polar modes.

The perturbation theory is based on a power series
expansion in terms of the parameter $\alpha$, which is defined
in Eq. (1) below. $\alpha$ is a measure of the strength of the
interaction of the electrons with the polar modes⁷,⁸ and
should be less than unity for the perturbation theory
to be applicable. It should be much less than unity for
rapid convergence of the theory.

The polaron theory of mobility has recently been
developed⁸ to replace the perturbation theory when
$\alpha$ approaches or exceeds unity.

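To see if the perturbation theory is applicable in
PbS we calculate $\alpha$:

$$\alpha=\left(\frac{1}{\epsilon_0}-\frac{1}{\epsilon}\right)\frac{e^2}{\hbar}\left(\frac{m_e}{2\hbar\omega_l}\right)^{1/2}\left(\frac{m_e^*}{m_e}\right)^{1/2}. \qquad (1)$$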


$\epsilon$ is the static dielectric constant, $\epsilon_0$ is the high-frequency
or electronic dielectric constant, $\omega_l$ is the angular
frequency of the longitudinal polar modes, $e$ is the
electronic charge, $m_e$ is the free electron mass, $m_e^*$ is
the effective electron mass, $\hbar=h/2\pi$, and $h$ is Planck's
constant. For PbS we have:


$\epsilon=17.9$, $\epsilon_0=15.3$,⁹

$\omega_l=(\epsilon/\epsilon_0)^{1/2}\omega_t$,¹⁰    (2)

$\omega_t$ = infrared reststrahlung angular frequency $=2\pi c/\lambda_t$,

$\lambda_t=80$ microns.¹¹


Substituting these into Eq. (1) we find $\alpha=0.28\,(m_e^*/m_e)^{1/2}$.
For nominal values of $(m_e^*/m_e)$, $\alpha$ is only slightly
less than unity. Thus we will compare the data with
both the perturbation and the polaron theory, the 
former at present being further developed than the 
latter. 
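The evaluation of the expansion parameter can be retraced numerically. The sketch below assumes that Eq. (1) is the usual Fröhlich coupling constant, which in Gaussian units reduces to (1/ε₀ − 1/ε)·(Ry/ħω_l)^{1/2}·(m_e*/m_e)^{1/2} with Ry the Rydberg energy; the function names and the modern values of the constants are not from the paper.

```python
import math

RYDBERG_EV = 13.605693       # Rydberg energy (ev)
HC_EV_MICRON = 1.2398420     # h*c in ev*micron
K_B_EV_PER_K = 8.617333e-5   # Boltzmann constant (ev/K)


def longitudinal_phonon_energy_ev(eps_static: float, eps_high: float,
                                  lambda_t_micron: float) -> float:
    """hbar*omega_l from the reststrahlung wavelength and Eq. (2)."""
    transverse_ev = HC_EV_MICRON / lambda_t_micron
    return transverse_ev * math.sqrt(eps_static / eps_high)


def coupling_prefactor(eps_static: float, eps_high: float,
                       phonon_energy_ev: float) -> float:
    """Coefficient multiplying (m*/m)**0.5 in the assumed form of Eq. (1)."""
    return (1.0 / eps_high - 1.0 / eps_static) * math.sqrt(RYDBERG_EV / phonon_energy_ev)


def coupling_constant(mass_ratio: float, prefactor: float) -> float:
    """alpha for a given effective-mass ratio m*/m."""
    return prefactor * math.sqrt(mass_ratio)


if __name__ == "__main__":
    e_l = longitudinal_phonon_energy_ev(17.9, 15.3, 80.0)
    pre = coupling_prefactor(17.9, 15.3, e_l)
    print(f"theta = {e_l / K_B_EV_PER_K:.0f} K, prefactor = {pre:.2f}")
    print(f"alpha(m*/m = 0.33) = {coupling_constant(0.33, pre):.2f}")
```

With the PbS data quoted above, this gives a characteristic temperature near 194°K and a prefactor of about 0.27, close to the 0.28 quoted in the text; the small difference presumably reflects the constants used.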


⁴ H. Fröhlich, Proc. Roy. Soc. (London) A160, 280 (1937);
H. Fröhlich and N. F. Mott, Proc. Roy. Soc. (London) A171, 496 (1939).
⁵ B. Davydov and I. Shmushkevitch, J. Phys. U.S.S.R. 3, 359 (1940).
⁶ R. J. Seeger and E. Teller, Phys. Rev. 54, 515 (1938); 56,
⁷ Fröhlich, Pelzer, and Zienau, Phil. Mag. 41, 221 (1950).
⁸ F. Low and D. Pines, Phys. Rev. 91, 193 (1953).
⁹ International Critical Tables.
¹⁰ Lyddane, Sachs, and Teller, Phys. Rev. 59, 673 (1941).
¹¹ J. Strong, Phys. Rev. 38, 1818 (1931).


It is of interest to note that in PbS $\alpha$ is considerably
less than in the alkali halides, where $\alpha$ is of the order
3-6. The main reason for this is that the high electronic
dielectric constant in PbS weakens the electron-lattice
interaction and reduces $\alpha$. This is better shown by combining
Eq. (1) with $\epsilon\approx\epsilon_0+\epsilon_i$, where $\epsilon_i$ is the contribution
to the dielectric constant from the ion motion:

$$\alpha\propto\epsilon_i/[\epsilon_0(\epsilon_0+\epsilon_i)]\approx\epsilon_i/\epsilon_0^2. \qquad (3)$$

Equation (3) shows that $\epsilon_0$ reduces the electron-lattice
interaction essentially as $1/\epsilon_0^2$, which agrees with the
notion that the electronic polarization is rapid enough
to follow the motion of conduction electrons. By a
similar argument we show below that the mobility is
proportional to $\epsilon_0^2$. Thus it is to be expected that in
polar crystals of different values of $\epsilon_0$, the mobility
will be roughly proportional to $\epsilon_0^2$. Thus high electronic
dielectric constants characterize crystals with low intrinsic
energy gaps¹² and high mobilities.


A. Perturbation Theory of Fröhlich-Mott,
Howarth-Sondheimer 


The solution for the mobility as a function of tem- 
perature is normally made in two steps. The first is to 
find an expression for the scattering cross section for an 
electron of a fixed energy. The second is to make a sta- 
tistical average over the thermal distribution of elec- 
tron energies. The solution of the Boltzmann transport 
equation provides the most accurate method of per- 
forming the statistical average. 

An approximate solution for the statistical part of
the problem was obtained by Fröhlich and Mott⁴ by
the use of a special definition of the relaxation time,
without an attempt to solve the Boltzmann equation.
Recently Howarth and Sondheimer have solved the
Boltzmann equation and have obtained an expression
for the mobility as a function of temperature. They
used the scattering cross sections of Fröhlich-Mott in
the Boltzmann equation. We therefore will call this the
F-M, H-S theory.

The original Fröhlich-Mott result contained an error
in the way the polarizability of the ions appears in the
mobility equation. This has since been corrected,⁷,¹⁴
but Howarth and Sondheimer's result does not include
the correction. After briefly outlining their solution we
will make this correction.

The solution of the Boltzmann equation for nondegenerate
polar semiconductors is complicated because
the energy exchanged in a collision is large relative to
the electron energy. Thus, one cannot use the approximation
of neglecting the energy exchanged in a collision
to solve the Boltzmann equation. However, since the
optical modes can, in good approximation, be considered
to be of a single frequency, no integration over
lattice frequencies is required. A finite difference equation
results in place of the integral equation for metals.
H-S employ the variational method introduced by
Kohler to obtain a solution for the whole range of
temperature. The formula is expressed as ratios of
infinite determinants and may be evaluated to any
degree of accuracy by breaking off the determinants at
a finite number of rows and columns. Closed form expressions
are given for the limits $T\to0$ and $T\to\infty$.

¹² T. S. Moss, Photoconductivity in the Elements (Academic
Press, New York, 1952), p. 61.
¹³ D. J. Howarth and E. H. Sondheimer, Proc. Roy. Soc. (London)
A219, 53 (1953).
¹⁴ H. B. Callen, Phys. Rev. 76, 1394 (1949).

FIG. 1. Theoretical curve as given in reference 13,
Curve 2 of Fig. 1.

The F-M, H-S expression¹³ for the conductivity $\sigma$ is:


16a°M w)k?T?(e*#—1)X(z)e#//*7 
o= ; (4) 
21Q?3h? 


where $\chi(z)$ is shown in Fig. 1, which is Curve 2 of
Fig. 1, reference 13; $z=\theta/T$, $\theta=\hbar\omega_l/k=194$°K, $k$ is
Boltzmann's constant, $a$ is the lattice spacing between
ions, $M$ is the reduced mass of the ions, and $E_F$ is the
Fermi energy. $Q$ is the ionic charge and is approximately
$2e$, but is not known exactly without a knowledge
of the amount of homopolar binding in the crystal.
Equation (4) is correct when the ions are not polarizable.
When $\epsilon_0>1$ the necessary modification is to replace
$Q$ by $Q/\epsilon_0$. The physical basis of this is that the
electronic polarization of the ions is rapid enough to
follow the motion of the conduction electrons. Therefore
the energy of interaction of an electron at a distance
$r$ from an ion is $eQ/\epsilon_0r$ instead of $eQ/r$. Thus $Q$
should be replaced by $Q/\epsilon_0$ in Eq. (4). We have then:


o=[(@Mened?/(21Q") ]f(2), (5) 


where we have let (16%?7?/3h) (e*—1)x(z)e*//*7 = f(z). 
Equation (5) can be used in this form, but it is useful 
to eliminate Q since it is not precisely known. This can 
be done with the Born¹⁶ relation:


P= (e— 6) a®@Mw?/2r, (6) 


¹⁵ M. Kohler, Z. Physik 124, 772 (1948); A. H. Wilson, The
Theory of Metals (Cambridge University Press, Cambridge, 1953),
second edition, Chap. 10.

¹⁶ M. Born and M. Göppert-Mayer, Handbuch der Physik 24,
2, 646 (1933).
Finally, it is convenient to substitute the Lyddane-Sachs-Teller
relation,¹⁰ $(\omega_l/\omega_t)^2=\epsilon/\epsilon_0$ [Eq. (2)] into
Eq. (7):


€€9 


o-|—" lr, (8) 


w1(€— €0) 
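From Eq. (8) and $\epsilon\approx\epsilon_0+\epsilon_i$, we conclude that

$$\sigma\propto\epsilon_0(\epsilon_0+\epsilon_i)/\epsilon_i\approx\epsilon_0^2/\epsilon_i, \qquad (9)$$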


in agreement with our discussion above. 

For comparison with experimental data it will be
convenient to express Eq. (8) in terms of the mobility,
$\mu$. To do this it is first necessary to determine whether
Fermi-Dirac or Boltzmann statistics are applicable.
The degeneracy temperature is given by

$$\theta_D=(\hbar^2/2km_e)(3\pi^2N_D)^{2/3}(m_e/m_e^*), \qquad (10)$$

where $N_D$ = density of donor atoms.

$N_D$ is calculated from the Hall data in the exhaustion
region and is listed in Fig. 2 for the samples studied.
Unless $m_e/m_e^*$ is abnormally large, these calculations
show that $\theta_D$ is low enough that Boltzmann statistics
can be employed for all but possibly one of the samples
studied. Further comments on this will be made after
evaluating $m_e/m_e^*$ from the mobility data.

Assuming Boltzmann statistics, we have for the
density of electrons in the conduction band:

$$n=2(2\pi m_ekT/h^2)^{3/2}e^{E_F/kT}(m_e^*/m_e)^{3/2}. \qquad (11)$$

The zero of energy is at the bottom of the conduction
band. Substituting Eq. (11) into Eq. (8) and using
$\sigma=e\mu_en$ we find:

$$\mu_e(\text{F-M, H-S})=\frac{8ea_0}{3(2\pi m_ek\theta)^{1/2}}\,\frac{\epsilon\epsilon_0}{\epsilon-\epsilon_0}\left(\frac{m_e}{m_e^*}\right)^{3/2}\frac{\chi(z)(e^z-1)}{z^{1/2}}, \qquad (12)$$

where $a_0=\hbar^2/m_ee^2$. Substituting the published data for
PbS and converting to practical units, we have

$$\mu_e=192\,(m_e/m_e^*)^{3/2}\chi(z)(e^z-1)/z^{1/2}\ \text{cm}^2/\text{volt-sec}. \qquad (13)$$

Closed-form expressions for $\chi(z)$ at low and high
temperatures are

$$\chi(z)=1,\quad z\ll1;$$

$$\chi(z)=\tfrac{3}{8}(\pi z)^{1/2},\quad z\gg1. \qquad (14)$$
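The degeneracy check of Eq. (10) and the practical mobility formula of Eq. (13) can be sketched as below. The equation forms are read from a damaged printing and should be treated as assumptions: Eq. (10) is taken as the free-electron degeneracy temperature divided by m_e*/m_e, and Eq. (13) as 192 (m_e/m_e*)^{3/2} χ(z)(e^z − 1)/z^{1/2}. The function χ(z) is only available as a curve, so it is passed in by the caller; the donor density in the usage lines is hypothetical.

```python
import math

HBAR = 1.054571817e-34   # J s
M_E = 9.1093837015e-31   # kg
K_B = 1.380649e-23       # J/K
THETA_K = 194.0          # characteristic temperature quoted in the text (K)


def degeneracy_temperature(n_donor_per_m3: float, mass_ratio: float) -> float:
    """Assumed Eq. (10); mass_ratio is m_e*/m_e, density in m^-3, result in K."""
    free_electron = (HBAR ** 2 / (2.0 * K_B * M_E)
                     * (3.0 * math.pi ** 2 * n_donor_per_m3) ** (2.0 / 3.0))
    return free_electron / mass_ratio


def polar_mobility(temperature_k: float, mass_ratio: float, chi: float) -> float:
    """Assumed Eq. (13) in cm^2/volt-sec; chi is the value of chi(z) at z = theta/T."""
    z = THETA_K / temperature_k
    return 192.0 * mass_ratio ** -1.5 * chi * (math.exp(z) - 1.0) / math.sqrt(z)


if __name__ == "__main__":
    # Hypothetical donor density of 1e18 per cm^3 (1e24 per m^3).
    print(f"theta_D = {degeneracy_temperature(1e24, 0.33):.0f} K")
    # High-temperature limit chi = 1, used here only as an illustration.
    print(f"mu(600 K) = {polar_mobility(600.0, 0.33, 1.0):.0f} cm^2/volt-sec")
```

Passing χ explicitly keeps the sketch honest about what is known: only the two limiting forms are given in closed form, and intermediate values have to be read from Fig. 1.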


B. The Polaron Theory of Mobility 


Whereas the perturbation theory of mobility uses 
the electron as the basic unit in a scattering process, 
the polaron theory uses the polaron as the basic unit. 
FIG. 2. Experimental data for mobility as calculated from Eq. (18).


A polaron⁷,⁸,¹⁷,¹⁸ is defined as an electron plus the
polarization it induces in the lattice. A good approximation
for the effective mass of the polaron, $m_p$, is¹⁸

$$m_p=m_e^*(1+\alpha/6), \qquad (15)$$


Low and Pines⁸ have used polaron wave functions
in place of electron wave functions to calculate the
mobility at low temperatures. Their result for $T<\theta$,
with the $2\pi$ correction noted by Redfield¹ included, is


aa 


ce) a 


where $f(\alpha)$ is a slowly varying function of $\alpha$ which may
be taken to be 5/4 for $3<\alpha<6$ and $f(\alpha)=1$ in the
limit $\alpha\to0$.

For comparison with the F-M, H-S result at low
temperatures we substitute Eq. (14) into Eq. (12) and
compare with Eq. (16):


$$\mu_e(\text{L-P})=(m_e^*/m_p)^3f(\alpha)\,\mu_e(\text{F-M, H-S}),\quad T/\theta\ll1. \qquad (17)$$


¹⁷ S. Pekar, J. Phys. (U.S.S.R.) 10, 341-347 (1946); J. Markham
and F. Seitz, Phys. Rev. 74, 1014 (1948); L. Landau and S. Pekar,
J. Exptl. Theoret. Phys. (U.S.S.R.) 18, 419 (1948); S. Pekar, J.
Exptl. Theoret. Phys. (U.S.S.R.) 19, 796 (1949).
¹⁸ Lee, Low, and Pines, Phys. Rev. 90, 297 (1953).


when $\alpha<6$.


If $\alpha$ is small there is little difference between these
theories at low temperatures. 

The high-temperature mobility formula for the
polaron theory has not yet been worked out because of
the complicated nature of the expressions⁸ for the
scattering cross sections.


III. CALCULATION OF MOBILITY FROM THE 
HALL-RESISTIVITY DATA 


We now describe the analysis of mobility data obtained
on a group of p-type and n-type single crystals
of PbS in which the impurity concentration covers the 
range from about 10!*/cc to 10'*/cc. 

From the analysis of a one-carrier semiconductor the
mobility is given by the expression:


$$\mu=8R/3\pi\rho\ \text{cm}^2/\text{volt-sec}, \qquad (18)$$


where $R$ is the Hall coefficient and $\rho$ is the resistivity.
At low temperatures, where impurity conductivity 
predominates, this expression is satisfactory. Figure 2 
shows the mobility as calculated from Eq. (18) with 
the Hall-resistivity data of reference 3. The mobility 
in the impurity range of conductivity is independent of 
the concentration of impurity centers and therefore 
appears to result from lattice scattering. 
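The one-carrier expression can be applied directly to a measured pair of Hall coefficient and resistivity. In the sketch below the function name and the sample numbers are illustrative; the Hall coefficient is taken in cm³/coulomb and the resistivity in ohm-cm, so the result is in cm²/volt-sec, and the magnitude of the Hall coefficient is used so that the same function serves n-type and p-type samples.

```python
import math


def hall_mobility(hall_coefficient_cm3_per_c: float, resistivity_ohm_cm: float) -> float:
    """One-carrier mobility, mu = 8|R| / (3*pi*rho), in cm^2/volt-sec."""
    if resistivity_ohm_cm <= 0.0:
        raise ValueError("resistivity must be positive")
    return 8.0 * abs(hall_coefficient_cm3_per_c) / (3.0 * math.pi * resistivity_ohm_cm)


if __name__ == "__main__":
    # Hypothetical sample: |R| = 10 cm^3/coulomb, rho = 0.01 ohm-cm.
    print(f"{hall_mobility(-10.0, 0.01):.0f} cm^2/volt-sec")
```

The factor 8/3π is the reciprocal of the 3π/8 that relates the Hall coefficient to the carrier density in the exhaustion region.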

In the high-temperature range, electron and hole 
mobilities can be found from the data by use of the two 





R. L. PETRITZ AND W. W. SCANLON

FIG. 3. Experimental data for single carrier mobilities as
calculated from Eq. (20): (a) n-type; (b) p-type. Curve A is the
theoretical polar mobility of F-M, H-S, Eq. (13). The polaron
mobility, Eqs. (16) and (17), has the same shape as Curve A
when $T<\theta$. No results have been published for the region
$T>\theta$. Curve B is the combined theoretical polar (F-M, H-S
theory) and acoustical mobility.
carrier formulas for Hall and resistivity as follows:

$$R=-\frac{3\pi}{8e}\,\frac{nc^2-p}{(nc+p)^2},$$

$$1/\rho=(e\mu_e/c)(nc+p), \qquad (19)$$

$$n=p+N_D\quad (n\text{-type}),$$

where $c=\mu_e/\mu_h$, $n$ = concentration of electrons, and
$p$ = concentration of holes. Solving for the electron
mobility, one obtains

$$\mu_e=-\frac{4R_E}{3\pi\rho}\left\{(1-c)+\left[(1-c)^2+\frac{4cR}{R_E}\right]^{1/2}\right\}, \qquad (20)$$

where $R_E=-3\pi/8eN_D$ is the Hall coefficient in the
exhaustion region.

The electron mobility data as calculated from Eq.
(20) are shown in Fig. 3. The corresponding mean free
path was calculated by using

$$l=(3\mu/4e)(2\pi mkT)^{1/2}, \qquad (21)$$

and is shown in Fig. 4. Corresponding curves for holes
are also shown in Figs. 3 and 4.

IV. COMPARISON OF THEORY AND EXPERIMENT
A. F-M, H-S Theory

Curve A on Fig. 3a is a plot of Eq. (13). Since $\theta$ is
fixed the only unknown parameter of Eq. (13) is $m_e/m_e^*$.
By fitting the theoretical curve to the data for n-type
material as shown in Fig. 3a we find $m_e^*/m_e=0.33$.
Similarly for holes the data is shown in Fig. 3b and we
find $m_h^*/m_e=0.36$. Substituting $m_e^*/m_e=0.33$ into
Eqs. (2) and (15) we find $\alpha=0.16$ and $m_p/m_e^*=1.03$.
Substituting $m_e^*/m_e=0.33$ into Eq. (10), we conclude
that the use of Boltzmann statistics is permissible
since $\theta_D=62$°K.

The poor agreement between theory and experiment
(Fig. 3a) as a function of temperature raises the question
as to the applicability of the perturbation theory
to PbS and what significance should be placed on the
values obtained for the effective masses. A check on
this is provided by the high temperature Hall data.
We have calculated Hall curves using Eqs. (11) and
(19) and $c=1.4$ (reference 3), $E_0=0.37$ ev (reference
19), $m_e^*/m_e=0.33$, and $m_h^*/m_e=0.36$ and find that
these curves fit the experimental Hall curves reasonably
well. The spread in the mobility data over the samples
studied is such that these values for effective masses
have an accuracy not greater than ±25 percent.

With $\alpha=0.16$, the perturbation theory can be expected
to have some validity in PbS. Improved agreement
between theory and experiment may be expected
by bringing into the analysis other scattering mechanisms.
This is particularly the case at $T<\theta$ since the
polar modes are being frozen out. We discuss this
possibility in Sec. V.

¹⁹ W. W. Scanlon, Phys. Rev. 92, 1573 (1953).
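The two-carrier reduction of Eqs. (19) and (20) can be checked by a round trip: generate a Hall coefficient and resistivity from known carrier densities, then recover the electron mobility. The sketch below is written under stated assumptions. It uses the standard two-carrier Hall expression with the 3π/8 factor for an n-type sample with n = p + N_D, and the closed form for the mobility is derived here from those relations rather than copied from the printed Eq. (20). SI units are used throughout, and the numbers in the usage lines are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19  # electronic charge (C)


def hall_and_resistivity(n: float, p: float, mu_e: float, c: float):
    """Two-carrier Hall coefficient (m^3/C) and resistivity (ohm m).

    n, p are electron and hole densities (m^-3); c = mu_e / mu_h.
    """
    hall = -(3.0 * math.pi / (8.0 * E_CHARGE)) * (n * c ** 2 - p) / (n * c + p) ** 2
    rho = c / (E_CHARGE * mu_e * (n * c + p))
    return hall, rho


def electron_mobility(hall: float, rho: float, n_donor: float, c: float) -> float:
    """Electron mobility (m^2/volt-sec) of an n-type sample with n = p + N_D."""
    hall_exhaustion = -3.0 * math.pi / (8.0 * E_CHARGE * n_donor)
    bracket = (1.0 - c) + math.sqrt((1.0 - c) ** 2 + 4.0 * c * hall / hall_exhaustion)
    return -(4.0 * hall_exhaustion / (3.0 * math.pi * rho)) * bracket


if __name__ == "__main__":
    n_donor, p, mu_e, c = 1.0e24, 0.5e24, 0.05, 1.4  # hypothetical sample
    hall, rho = hall_and_resistivity(p + n_donor, p, mu_e, c)
    print(f"recovered mu_e = {electron_mobility(hall, rho, n_donor, c):.4f} m^2/volt-sec")
```

In the exhaustion region (no holes) the bracket equals 2 and the expression reduces to the one-carrier formula of Eq. (18).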
B. Polaron Theory


Since the temperature dependence of the polaron
theory for $T<\theta$ is the same as that of the perturbation
theory [Eq. (17)], we can consider Curve A on Fig. 3a
as representing the polaron theory for $T<\theta$. We obtain
an equation for $m_e^*/m_e$ directly from the polaron theory
by combining Eqs. (1), (15), and (17) with Curve A of
Fig. 3a:





$$\left(\frac{m_e^*}{m_e}\right)^{3/2}\frac{f(\alpha)}{[1+(0.28/6)(m_e^*/m_e)^{1/2}]^3}=(0.33)^{3/2}. \qquad (22)$$


Since $\alpha$ is a function of $m_e^*/m_e$, Eq. (22) can be
solved for $m_e^*/m_e$. However, the functional form of
$f(\alpha)$ has not been published so we consider its two extreme
values. When $f(\alpha\to0)=1$ the solution of Eq.
(22) is $m_e^*/m_e=0.34$ and from Eqs. (1) and (15) we
find $\alpha=0.16$ and $m_p/m_e^*=1.03$.

Considering the other limit, $f(3<\alpha<6)=5/4$, we
find $m_e^*/m_e=0.29$, $\alpha=0.15$, and $m_p/m_e^*=1.025$. From
the close agreement of these two sets of results and
those of the perturbation theory, we conclude that the
perturbation and polaron theories are nearly identical
in the low-temperature region.


V. ANALYSIS BASED ON POLAR AND ACOUSTICAL 
MODE SCATTERING 

The mobility curves, Fig. 2, for the samples of PbS 
studied are very similar—there being no noticeable 
dependence on impurity concentration over the tem- 
perature range examined. We conclude that the mo- 
bility in the range of temperature is an intrinsic prop- 
erty of PbS. There are at least two intrinsic scattering 
mechanisms not yet considered in our analysis, acousti- 
cal vibrations of the lattice and those lattice defects 
inherent at thermal equilibrium. It is reasonable to 
expect that scattering in polar semiconductors, other 
than that arising from the polar vibrations of the 
lattice, will be similar to scattering in nonpolar semi- 
conductors. Experiments on nonpolar semiconductors 
indicate that scattering by lattice defects is obscured at 
high temperatures by that from the acoustical modes. 
At low temperatures the density of intrinsic lattice 
defects in PbS can be expected to be small compared 
to the density of impurity atoms and/or defects arising 
from deviations from stoichiometry. Neither of these
latter effects appears in our mobility data, Fig. 2. We
therefore conclude that only the combined effects of 
polar and acoustical modes need be considered in the 
further analysis of the mobility data. 

Since the perturbation and polaron theories are essentially the same at low temperatures in PbS, and since no results have been published for the polaron theory for $T>\theta$, we confine this analysis to combining the F-M, H-S polar theory with acoustical theory.
FIG. 4. Experimental reciprocal mean free path, Eq. (21).


The dependence of mobility on temperature for acoustical mode scattering²⁰ has been shown to be

$$ \mu_a = B T^{-3/2} = D z^{3/2}. \qquad (23) $$


A rigorous method for combining the effect of two 
or more scattering mechanisms must recognize the ve- 
locity dependence of the mean free path of the electron 
for each scattering mechanism concerned. For the 
acoustical modes the mean free path is independent of 
velocity while for the optical modes it is proportional 
to the velocity. Because of this dependence on velocity 
it is not strictly correct to obtain a resultant mobility 
for the two scattering mechanisms by adding reciprocal 
mobilities. However, we will employ this approximation 
in our analysis for reasons of simplicity. This approxi- 
mation has been used with good results in germanium 
to combine acoustical and impurity scattering. Combining reciprocal mobilities from Eqs. (13) and (23), we have

$$ \frac{1}{\mu} = \frac{z^{1/2}}{192\,(m_e/m_e^*)^{3/2}\,(e^{z}-1)\,\chi(z)} + \frac{1}{D z^{3/2}}. $$





With $\theta$ fixed at 194°C the high-temperature region
cannot be accurately fitted by this approximation.
A reasonable fit at low and intermediate temperatures 
is shown by Curve B, Fig. 3a. From this we find 
m.*/m-=0.22 and B=5.7X10°. A similar analysis for 
p-type material leads to m,*/m,=0.1 and B=3.1X 10°. 
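
The reciprocal-mobility approximation used above can be sketched numerically. The snippet below is only an illustration, assuming the acoustical law of Eq. (23) has the form $B T^{-3/2}$; the function names and the sample numbers are hypothetical and are not taken from the paper.

```python
def acoustic_mobility(temperature_k: float, b_coeff: float) -> float:
    """Acoustical-mode mobility, assuming the T^(-3/2) law of Eq. (23)."""
    return b_coeff * temperature_k ** -1.5


def combined_mobility(mu_polar: float, mu_acoustic: float) -> float:
    """Resultant mobility obtained by adding reciprocal mobilities.

    This is the simplifying approximation discussed in the text; it ignores
    the different velocity dependence of the two mean free paths.
    """
    return 1.0 / (1.0 / mu_polar + 1.0 / mu_acoustic)


if __name__ == "__main__":
    # Illustrative values only (hypothetical, not fitted to the PbS data).
    mu_a = acoustic_mobility(300.0, 5.0e6)
    print(combined_mobility(1000.0, mu_a))
```

The combined value is always smaller than either individual mobility, which is why the acoustical term matters most where the polar-mode mobility becomes large.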

These values of the electron and hole effective-mass ratios, when used in the calculation of the Hall curves, also provide reasonably good agreement with experiment. However, the additional parameter, $B$, is somewhat arbitrary because only a small part of the data extends into the $T^{-3/2}$ region. This reflects uncertainty in the value of the effective-mass ratio. Therefore, these values of effective masses are not significantly more accurate than those obtained in the polar mode analysis. Further data at temperatures below 77°K should improve the accuracy of these results.

20 C. Kittel, Introduction to Solid State Physics (John Wiley and Sons, Inc., New York, 1953), p. 277 for references to the original work.

As an additional check on the inclusion of the acous- 
tical modes we compare our B’s with those of non-polar 


semiconductors.” The values of B listed vary from 


5X 10° to 9X 108, which include our values. 


VI. DISCUSSION 


The combined theory has served to remove the discrepancy with experiment at $T<\theta$; the acoustical mode scattering provides an explanation of the approach to $T^{-3/2}$ behavior. Discrepancy still exists for $T>\theta$. A first consideration for sources of error is in the addition of reciprocal mobilities. This error is hardly significant since the data go as approximately $T^{-5/2}$ while the acoustical as $T^{-3/2}$ and the optical as $T^{-1/2}$. It is not likely that a more refined method of calculating the resultant mobility will bring it to a $T^{-5/2}$ dependence.

Since the $T^{-3/2}$ law for acoustical scattering has been experimentally verified in nonpolar semiconductors and metals we do not consider this further.

Considering next the statistical solution of H-S, they show that the variational solution of $\chi(z)$ at low and high temperatures agrees with the closed-form solution [Eq. (14)]. The region of intermediate temperature is calculated to third order and appears to be rapidly converging. It therefore does not seem likely that the deviation between theory and experiment can be attributed to inadequacies in the solution of the Boltzmann equation.

Going back further one might question the use of a single frequency to represent the optical modes of the lattice since the reststrahlung data do not indicate a sharp frequency. However, the approximation that all modes are of the same frequency correctly counts the number of polar modes, and therefore should give the right order of magnitude of scattering when all modes are excited, as at high temperatures.

21 See reference 20, p. 277, Table 14.2.

We thus conclude that the principal source of error 
in the perturbation theory of mobility is most likely to 
be in the expressions for the scattering cross section of 
the electron-polar mode interaction for electron energies 
greater than k@. In view of the closeness of a@ to unity 
it is not surprising that first order expressions for the 
scattering cross sections need revision. 


VII. CONCLUSIONS 


From the comparison of mobility data in PbS crystals with theory we conclude that the perturbation theory of mobility of Fröhlich-Mott, Howarth-Sondheimer, when combined with acoustical scattering, is valid in the temperature range 77°K to $\theta$, and leads to effective masses for electrons and holes in the range of 0.1 $m_e$ to 0.4 $m_e$. In the region $T>\theta$ discrepancy exists between theory and experiment. It is concluded that the most likely source of error lies in the perturbation theory expressions for the scattering cross sections of the electron-polar mode interaction for electron energies greater than $k\theta$.

The polaron theory of mobility is nearly identical
with the F-M, H-S theory at $T<\theta$, so the foregoing
remarks concerning the low-temperature region also apply
to it. Since the polaron theory is intended for use when 
a1 it may, when completed, agree with experiment 
for T>6. A comparison with the PbS data at high 
temperatures should provide a sensitive test of the 
general polaron theory as well as being of interest in 
the theory of mobility in polar crystals. 
A theory for the paramagnetic effect is developed under the assumption that the ratio $l/a$ of the effective length to the effective diameter of the superconducting particles in the transition region is constant. While experiment shows that the relative apparent permeability $K_m$ is a function of $\gamma=\varphi_0(1-I_g/I)$, where $\varphi_0=H_{\varphi 0}/H_{z0}$, $H_{\varphi 0}$ is the circular and $H_{z0}$ the longitudinal component of the field at the surface, $I_g$ is a limiting current, and $I$ the total current through the sample, this theory gives the permeability as a function of $\varphi_0$ only. Good agreement with the experimental range of $K_m$, however, is obtained when the theoretical value of $\varphi_0$ is replaced by $\gamma$. The experiments of the author et al. on solid and hollow mercury cylinders and recent experiments of Thompson and Squire on a solid tin cylinder are discussed. A reason why the theoretical value of $\varphi_0$ has to be replaced by $\gamma$ cannot be given at this time, although it is indicated where the present theory has to be amended.





I. INTRODUCTION 


Steiner¹ found that when a large direct current is passed longitudinally through a long cylindrical superconductor in the presence of a weak longitudinal magnetic field while the temperature is lowered through the transition region, the longitudinal flux inside the cylinder may exceed that in the normal-conducting state. This flux increase occurs only if the current exceeds a certain minimum value.

It was proven² that this “paramagnetic effect” is due to a helical path of the current through the superconductor. Recently Teasdale and Rorschach³ and Thompson and Squire⁴ confirmed the existence of the paramagnetic effect. The following points were considered in II to develop a working model of the superconductor in this particular state: The temperature dependence of the resistance is in qualitative agreement with the calculations of London⁵ on the transition of a cylindrical superconductor in which a current is flowing (compare Fig. 3 in II and Fig. 41 in reference 5). At the point of maximum flux (and we are so far only interested in this point of the flux vs temperature curve) the total magnetic field at the surface of the superconductor is equal to the critical field $H_c$. According to London’s theory the superconductor consists of superconducting particles embedded in normal-conducting material. If only the current is present, then the magnetic field has only a circular component $H_\varphi=H_c$ and the shape of the superconducting particles could be the one suggested by Shoenberg (see reference 5, page 120, Fig. 40). If a longitudinal magnetic field is now superimposed, $H_z$ inside the superconductor will be larger than zero and the superconducting particles will have the shape of “propellers” or of grains oriented along a helix.

* Supported by a grant of the National Science Foundation.

1 K. Steiner and H. Schoeneck, Physik. Z. 38, 887 (1937). K. Steiner, Z. Naturforsch. 4a, 271 (1949).

2 Meissner, Schmeissner, and Meissner, Z. Physik 130, 521 (1951); 130, 529 (1951); 132, 529 (1952) referred to in the text as I, II, and III; Phys. Rev. 90, 709 (1953).

3 T. S. Teasdale and H. E. Rorschach, Jr., Phys. Rev. 90, 709 (1953).

4 J. C. Thompson and C. F. Squire, Phys. Rev. 96, 287 (1954).

5 F. London, Superfluids (Wiley & Sons, New York, 1950), Vol. 1, p. 120.

In both cases a helical path offers less resistance to 
the current than a straight one. The current will flow 
along a helix and will produce an additional longi- 
tudinal flux inside the superconductor. 

Defining an apparent relative permeability (denoted 
by @ in II): 


$$ K_m = \frac{1}{R^2\pi\mu_0 H_{z0}}\int_0^R 2\pi r\,B_z(r)\,dr, \qquad (1) $$

[$R$ = radius of superconductor, $B_z(r)$ = z-component of the macroscopic magnetic induction, $H_{z0}$ = external magnetic field, $\mu_0$ = permeability of vacuum], it was shown that the current necessary to reach $K_m=1$, i.e., the value of the normal conductor, is given by:

$$ I(K_m=1) = I_g + \gamma\,2\pi R H_{z0}, \qquad (2) $$

where the observed values of $I_g$ and $\gamma$ are listed in Table I. Solving Eq. (2) for $\gamma$ and noting that $I=2\pi R H_{\varphi 0}$ (we use the rationalized mks-system), we find

$$ \gamma = \frac{H_{\varphi 0}}{H_{z0}}\left(1-\frac{I_g}{I}\right). \qquad (2') $$

We consider $\gamma$ now as a variable, which has the values listed in Table I at $K_m=1$. In Fig. 1(a), measurements of $K_m$ for mercury are plotted as a function of $\gamma$ (compare II Fig. 4, where $K_m$ is plotted as a function of $I$).


TABLE I. Observed values of the characteristic constants $I_g$ and $\gamma$ for various superconductors.ᵃ








Transition Limiting current Factor 
temperatures Ig amp Y 
49 r 0.6 0.67 
50 ; . 0.67 
73 . : ? 
80 ; ; 0.37 
81 : I 0.37 


Atomic 


number Element 











ᵃ $\gamma$ is given for $R$ in m, $H_{z0}$ in amp/m.
FIG. 1. (a) Relative permeability $K_m$ vs $\gamma$ for different solid mercury cylinders and different values of the external field. Note that some of the curves coincide, so that 6 of the 11 measured curves are very close together. All curves are recalculated from the original measurements. (b) Relative permeability $K_m$ vs $\gamma$ for a solid tin cylinder (from measurements of Thompson and Squire).


In Fig. 1(b), measurements for tin are plotted in the same fashion (compare Fig. 4 of reference 3, where $B-\mu_0 H_z$ is plotted as function of $I$). Higher values of $\gamma$ correspond to higher currents. The plots show that there is no obvious dependence of the curves either on the diameter of the sample or on the external magnetic field, although the curves for mercury scatter very much. We conclude therefore that $K_m$ depends only on $\gamma$:

$$ K_m = K_m(\gamma) = K_m[\varphi_0(1-I_g/I)], \qquad (3) $$

where $\varphi_0=H_{\varphi 0}/H_{z0}$. The experimental scattering range of $\gamma$ apparently increases with $K_m$ and is very small for $K_m=1$.
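
As a small illustration of how the variable $\gamma$ follows from measured quantities, the sketch below evaluates Eqs. (2) and (2') in rationalized mks units. The function and parameter names are hypothetical, and the numbers in the usage lines are placeholders rather than values from Table I.

```python
import math


def current_for_unit_permeability(limiting_current: float, gamma: float,
                                  radius: float, h_z0: float) -> float:
    """Eq. (2): current at which K_m = 1, I = I_g + gamma * 2*pi*R*H_z0."""
    return limiting_current + gamma * 2.0 * math.pi * radius * h_z0


def gamma_from_current(current: float, limiting_current: float,
                       radius: float, h_z0: float) -> float:
    """Eq. (2'): gamma = (H_phi0 / H_z0) * (1 - I_g / I)."""
    h_phi0 = current / (2.0 * math.pi * radius)  # circular field at the surface
    return (h_phi0 / h_z0) * (1.0 - limiting_current / current)


if __name__ == "__main__":
    # Placeholder inputs: R in m, H_z0 in amp/m, currents in amp.
    i_unit = current_for_unit_permeability(1.7, 0.37, 0.004, 100.0)
    print(gamma_from_current(i_unit, 1.7, 0.004, 100.0))
```

The two functions are inverses of each other at $K_m=1$, which is simply the algebra that leads from Eq. (2) to Eq. (2').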

It will be shown in the following section that by simple additions to the London theory of the current-carrying superconductor we can obtain a paramagnetic effect. In this theory, however, $K_m$ depends on $\varphi_0$ only:

$$ K_m = K_m(\varphi_0). \qquad (4) $$

Furthermore it will be shown that, although Eq. (4) differs from Eq. (3), we get the right scattering range if we simply replace $\varphi_0$ by $\gamma$.


II. THEORY OF THE PARAMAGNETIC EFFECT
IN SOLID CYLINDERS


We choose a layer between $r$ and $r+dr$ in our superconductor and stretch it out to a plane. The superconducting grains (or the cross sections of the superconducting “propellers”) will give a pattern indicated in Fig. 2. We choose the $Z$-axis parallel to the original one and the $Y$-axis parallel to the former $\varphi$ direction. The spacing between the superconducting particles will be rather uniform, so that the local value of the magnetic field is always $h=H_c$. The direction of the particles is determined by the magnetic field since the force of the electric field is too small to move the boundaries appreciably.

We define the macroscopic magnetic induction $\mathbf{B}$ in the following way: We choose a plane whose normal vector $\mathbf{f}$ is parallel to $\mathbf{B}$, so that

$$ \mathbf{B}\cdot\mathbf{f} = \mu_0\int \mathbf{h}\cdot d\mathbf{f}. $$

Since $|\mathbf{h}|=H_c$, this gives approximately

$$ \mathbf{B} = \mu_0\,\xi_{II}\,\mathbf{H}, \quad\text{with}\quad \xi_{II}=d/(a+d), \qquad (5) $$

where $a/d$ is the ratio of the thickness of the particles to the spacing between the particles. $\mathbf{H}$ has the magnitude $H_c$ and the direction of the mean value of the local field $\mathbf{h}$. Since $\mathbf{H}$ has both components $H_\varphi$ and $H_z$, the vector $\mathbf{B}$ will make an angle $\alpha$ with the $y$-axis.

We choose now a set of axes $\eta$, $\zeta$ such that $\eta$ is parallel to $\mathbf{B}$ and $\zeta$ is perpendicular to $\mathbf{B}$.

If we apply an electric field $E_\zeta$ in the $\zeta$ direction, calling $e_N$ the local value of the electric field, we find for the mean value:

$$ \zeta E_\zeta = \int e_N\,d\zeta, $$

or, since $e_N$ is fairly constant,

$$ E_\zeta = \xi_{II}\,e_N. \qquad (6) $$

If we apply an electric field $E_\eta$ in the $\eta$ direction, then

$$ \eta E_\eta = \int e_N\,d\eta $$

and, by the same reasoning as above,

$$ E_\eta = \xi_I\,e_N, \quad\text{with}\quad \xi_I=d/(l+d), \qquad (7) $$

where $l/d$ is the ratio of the length of the particles to the spacing between particles.

FIG. 2. Layer of superconducting particles (shaded) oriented in the direction of the magnetic induction $\mathbf{B}$.





PARAMAGNETIC EFFECT IN SUPERCONDUCTORS. I 


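For an electric field in any other direction, we suppose that we can use with fair accuracy the ordinary transformation scheme; this means that we suppose the superposition principle to hold for the macroscopic fields $E_\eta$ and $E_\zeta$. We get then for instance:

$$ E_z = E_\eta\sin\alpha + E_\zeta\cos\alpha, \qquad E_y = E_\eta\cos\alpha - E_\zeta\sin\alpha. $$

The current density $J$ is given by

$$ J_\zeta = \sigma_n e_N, \qquad J_\eta = \sigma_n e_N, $$

where $\sigma_n$ is the value of the conductivity for the normal-conducting material.

Since $E_r=E_\varphi=0$ we find then for the current density, using Eqs. (6) and (7):

$$ J_\varphi = (\sigma_I-\sigma_{II})\sin\alpha\cos\alpha\,E_z, \qquad J_z = (\sigma_I\sin^2\alpha+\sigma_{II}\cos^2\alpha)\,E_z, \qquad (8) $$

where

$$ \sigma_I = \sigma_n/\xi_I, \qquad \sigma_{II} = \sigma_n/\xi_{II} \qquad (9) $$

are the principal values of the macroscopic anisotropic conductivity. If the angle $\alpha$ is zero, the equations reduce to London’s equations (reference 5).

It follows from Eq. (5) that $\xi_{II}=B/\mu_0 H_c=d/(a+d)$, which, together with Eqs. (7) and (9), gives

$$ \sigma_I = \sigma_n\frac{\mu_0 H_c}{B}\left[1+C\left(1-\frac{B}{\mu_0 H_c}\right)\right], \qquad \sigma_{II} = \sigma_n\frac{\mu_0 H_c}{B}, \qquad (10) $$

where $C=l/a-1$ may be still a function of the radius. Noting that

$$ \sin\alpha = H_z/H_c, \qquad \cos\alpha = H_\varphi/H_c, \qquad (11) $$

we find, using Eq. (10), that

$$ J_\varphi = \sigma_n\frac{\mu_0 H_c}{B}\,C\left(1-\frac{B}{\mu_0 H_c}\right)\frac{H_\varphi H_z}{H_c^2}\,E_z, \qquad J_z = \sigma_n\frac{\mu_0 H_c}{B}\left[1+C\left(1-\frac{B}{\mu_0 H_c}\right)\frac{H_z^2}{H_c^2}\right]E_z. \qquad (12) $$

For $H_z=0$ these equations reduce again to the appropriate equations given by London.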
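We now return to cylindrical coordinates, in which the $y$ components become $\varphi$ components. We note that $\mathbf{H}$ is not a function of $z$ and that $H_r=0$. The expression for curl $\mathbf{H}$ then reduces to

$$ \mathrm{curl}_z\mathbf{H} = \frac{1}{r}\frac{\partial(rH_\varphi)}{\partial r}, \qquad \mathrm{curl}_r\mathbf{H} = 0, \qquad \mathrm{curl}_\varphi\mathbf{H} = -\frac{\partial H_z}{\partial r}. \qquad (13) $$

We have further

$$ H_\varphi^2 + H_z^2 = H_c^2. \qquad (14) $$

The Maxwell equation curl $\mathbf{H}=\mathbf{J}$ then becomes

$$ \frac{1}{r}\frac{\partial(rH_\varphi)}{\partial r} = J_z, \qquad -\frac{\partial H_z}{\partial r} = J_\varphi. \qquad (15) $$

From Eqs. (14) and (15), we obtain

$$ H_z J_\varphi/H_\varphi + H_\varphi/r = J_z. $$

Using Eq. (12) and solving for $B$, we find

$$ B = \mu_0 H_c\,\sigma_n E_z\,r/H_\varphi. \qquad (16) $$

At the surface of the superconductor, $r=R$, the induction $B$ has the value $B=\mu_0 H_c$ and $H_\varphi=H_{\varphi 0}$. Thus

$$ R = H_{\varphi 0}/\sigma_n E_z, \qquad (17) $$

and

$$ B = \mu_0 H_c\,r H_{\varphi 0}/R H_\varphi. \qquad (18) $$

For $H_{z0}=0$ we have $H_{\varphi 0}=H_c$ and the equations revert to London’s corresponding equations.

The mean relative permeability $K_m$ is now given by

$$ K_m = \frac{1}{\pi R^2}\int_0^R \frac{H_{\varphi 0}}{H_\varphi}\,\frac{r}{R}\,\frac{H_z}{H_{z0}}\,2\pi r\,dr. \qquad (19) $$

In order to obtain the dependence of $H$ on $r$, we have to solve Eq. (15) with the aid of Eqs. (12), (14), (17), and (18). We obtain

$$ \frac{dH_\varphi}{dr} = C\left(\frac{H_\varphi}{r}-\frac{H_{\varphi 0}}{R}\right)\left(1-\frac{H_\varphi^2}{H_c^2}\right), \qquad (20) $$

where $C=l/a-1$ may be a function of $r$. Similarly, we obtain

$$ \frac{dH_z}{dr} = -C\left(\frac{H_\varphi}{r}-\frac{H_{\varphi 0}}{R}\right)\frac{H_\varphi H_z}{H_c^2}. \qquad (21) $$

Since the superconducting particles have always the same field $H_c$ around them, we will now assume that $C$ is constant, independent of $r$.

Then as $r\to 0$ Eq. (20) becomes:

$$ \left(\frac{dH_\varphi}{dr}\right)_{r\to 0} = C\left(\frac{H_\varphi}{r}\right)_{r\to 0}\left(1-\frac{H_\varphi^2(0)}{H_c^2}\right). $$

It can be shown that this equation has only the following, physically significant, solutions: Either $H_\varphi(0)=0$ or $H_\varphi(0)=H_c$. We know from the calculations of F. London that the latter solution holds for the case $H_{z0}=0$. One concludes that even a small field $H_{z0}$ changes the pattern of the superconducting particles entirely, so that now $H_\varphi(0)=0$ and $H_z(0)=H_c$.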

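For numerical calculations we put Eqs. (20) and (21) in a dimensionless form. With

$$ \varphi = H_\varphi/H_{z0}; \quad \chi = H_z/H_{z0}; \quad H_c/H_{z0} = (1+\varphi_0^2)^{1/2}; \quad \rho = r/R; \qquad (22) $$

we obtain:

$$ \frac{d\varphi}{d\rho} = C\left(\frac{\varphi}{\rho}-\varphi_0\right)\left(1-\frac{\varphi^2}{1+\varphi_0^2}\right), \qquad (20') $$

$$ \frac{d\chi}{d\rho} = -C\left(\frac{\varphi}{\rho}-\varphi_0\right)\frac{\chi\,[1-\chi^2/(1+\varphi_0^2)]^{1/2}}{(1+\varphi_0^2)^{1/2}}, \qquad (21') $$

$$ K_m = \frac{1}{\pi}\int_0^1 \frac{\varphi_0}{\varphi}\,\rho\chi\,2\pi\rho\,d\rho. \qquad (19') $$

$K_m$ is a function of $\varphi_0$ only. This is exactly what we have stated in Eq. (4).

FIG. 3. Circular (upper figure) and longitudinal (lower figure) component of the magnetic field vs radius for different values of $\varphi_0$. $\varphi_0$ is the circular component of the field at the surface of the sample. It is proportional to the total current. All curves for $C=10$. Thin lines: $\varphi=\varphi_0\rho\,C/(C-1)$.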
Equations (20') and (21') cannot be solved in analytical form. We have solved one of them by numerical methods for $C=10$. The result is plotted in Fig. 3.

It can be shown that

$$ \varphi = [C/(C-1)]\,\varphi_0\,\rho \qquad (23) $$

is a good approximation for $\varphi$. The approximation improves as $C$ increases, or more specifically, if

$$ (1+\varphi_0^2)/\varphi_0^2 \gg C^2/(C-1)^2. $$

We will use Eq. (23) for the limiting case $C=\infty$, which will not differ very much from the case $C=100$. From the curves in Fig. 3 we can now calculate the local value of $K_m$:

$$ K_m = \rho\chi\varphi_0/\varphi. $$

The result is plotted in Fig. 4(a). It can be seen that most of the flux increase occurs near the center of the sample. Integrating over $r$ according to Eq. (19) gives the mean value of $K_m$. This is plotted as function of $\varphi_0$ in Fig. 4(b). (Broken curve $C=10$.)

FIG. 4. (a) Local values of the relative permeability $K_m$ vs radius for different values of $\varphi_0$. All curves for $C=10$. (b) Mean values of the relative permeability $K_m$. Broken curves: theoretical values of $K_m$ as function of $\varphi_0$ for $C=10$ and $C=\infty$. Solid line: experimental values of $K_m$ as function of $\gamma$ for tin (from measurements of Thompson and Squire). Horizontal lines: experimental scattering range of $K_m$ as function of $\gamma$ for mercury.

With Eq. (23) we can immediately derive an analytical expression for $K_m(\varphi_0)$:

$$ K_m = \frac{2}{3}\left(\frac{C-1}{C}\right)^3\varphi_0\left\{\left(\frac{1+\varphi_0^2}{\varphi_0^2}\right)^{3/2} - \left[\frac{1+\varphi_0^2}{\varphi_0^2}-\frac{C^2}{(C-1)^2}\right]^{3/2}\right\}. \qquad (24) $$

We used Eq. (24) in order to calculate the curve for $C=\infty$ and the part below $\varphi_0=0.8$ of the curve for $C=10$, where Eq. (23) is a good approximation.
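
The mean permeability under the linear approximation of Eq. (23) can be checked numerically. The sketch below is an illustration only: it assumes $\varphi=k\varphi_0\rho$ with $k=C/(C-1)$ and $\chi=(1+\varphi_0^2-\varphi^2)^{1/2}$, so that the local permeability $\rho\chi\varphi_0/\varphi$ becomes $\chi/k$, and it compares the resulting closed form with a midpoint-rule integration over the cross section. Function names are hypothetical.

```python
import math


def _slope(c: float) -> float:
    """k = C/(C-1) from Eq. (23); k -> 1 in the limit C -> infinity."""
    return 1.0 if math.isinf(c) else c / (c - 1.0)


def km_closed_form(phi0: float, c: float = math.inf) -> float:
    """Mean K_m obtained by integrating the local value under Eq. (23)."""
    k = _slope(c)
    total = 1.0 + phi0 ** 2
    inner = total - (k * phi0) ** 2
    if inner < 0.0:
        raise ValueError("Eq. (23) is not usable for this phi0 and C")
    return (2.0 / 3.0) * (total ** 1.5 - inner ** 1.5) / (k ** 3 * phi0 ** 2)


def km_quadrature(phi0: float, c: float = math.inf, steps: int = 2000) -> float:
    """Midpoint-rule average of the local K_m = chi/k over the cross section."""
    k = _slope(c)
    h = 1.0 / steps
    acc = 0.0
    for i in range(steps):
        rho = (i + 0.5) * h
        chi = math.sqrt(1.0 + phi0 ** 2 - (k * phi0 * rho) ** 2)
        acc += (chi / k) * 2.0 * rho * h  # area weight 2*rho*d(rho)
    return acc


if __name__ == "__main__":
    for phi0 in (0.2, 0.5, 0.8):
        print(phi0, km_closed_form(phi0, 10.0), km_quadrature(phi0, 10.0))
```

Both routines rest on the same approximation, so their agreement checks only the integration, not the validity of Eq. (23) itself, which the text limits to small $\varphi_0$ for $C=10$.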

We now replace $\varphi_0$ by $\gamma=\varphi_0(1-I_g/I)$ and indicate according to Fig. 1(a) the scattering range of the measurements on mercury by horizontal lines at $K_m=1$, $K_m=1.5$, and $K_m=2$. We see that this range lies approximately between the curve for $C=10$ and the curve for $C=\infty$, although we would expect a much larger scattering range for $K_m=1$. The scattering arises apparently from a variation in $C$ at different runs. The solid curve represents the measurements of Thompson and Squire on tin according to Fig. 1(b). Since they had only one sample of very high purity, which did not melt between different runs as the mercury did, they observed no scattering.

Summarizing the results on solid cylinders, we can say that although we are unable to show that $K_m$ depends on $\gamma$ rather than only on $\varphi_0$, the numerical values of $K_m$ which we expect for a certain current $I$ are rather good.


III. THEORY OF THE PARAMAGNETIC EFFECT 
IN HOLLOW CYLINDERS 


In III, measurements on hollow, current-carrying cylinders in external fields were reported. However, before we discuss the case of current and external field, let us first see what we would expect if only the current $I_z$ flows through the sample.

As in London’s theory for the solid cylinder, $H_\varphi=H_c$. The question now arises as to how the field can be equal to $H_c$ near the inner surface $r=R_i$. Apparently, the conical superconducting rings now spread out to a thin layer as indicated in Fig. 5. The limiting current which can pass through a thin layer is smaller than that which would be calculated from the value of $H_c$ for the bulk material. In other words: For a thin, current-carrying superconducting layer, $H_c$ is smaller than for the bulk material. The thickness of this layer will be of the order of $10^{-4}$ cm.

FIG. 6. Increase of the maximum circular flux in the intermediate state relative to the flux in the normal-conducting state as function of $\rho_i=R_i/R_o$ (theoretical).

For the bulk of the material, $B=B_\varphi$ has the same value as for the solid cylinder:

$$ B_\varphi = \mu_0 H_c\,r/R_o. \qquad (25) $$

Since we can neglect the thin layer, the maximum circular flux per unit length is given by

$$ \Phi_{\varphi\,\mathrm{max}} = \int_{R_i}^{R_o} B_\varphi\,dr = \mu_0 H_c\,(R_o^2-R_i^2)/2R_o. \qquad (26) $$

The field distribution in the normal-conducting state is given by

$$ H_{\varphi n} = H_{\varphi 0}\,\frac{R_o}{r}\,\frac{r^2-R_i^2}{R_o^2-R_i^2} \qquad (27) $$

and the flux per unit length by

$$ \Phi_{\varphi n} = \mu_0 H_{\varphi 0}\,\frac{R_o}{R_o^2-R_i^2}\left[\tfrac{1}{2}(R_o^2-R_i^2) - R_i^2\ln\frac{R_o}{R_i}\right]. \qquad (28) $$

Since the total current in the normal-conducting and in the intermediate state is the same, it follows that $H_{\varphi 0}=H_c$.

We define now

$$ K_{m\varphi} = \frac{\Phi_{\varphi\,\mathrm{max}}}{\Phi_{\varphi n}} = \frac{\tfrac{1}{2}(R_o^2-R_i^2)^2}{R_o^2\left[\tfrac{1}{2}(R_o^2-R_i^2) - R_i^2\ln(R_o/R_i)\right]} $$

or in dimensionless form with $\rho_i=R_i/R_o$:

$$ K_{m\varphi} = \frac{(1-\rho_i^2)^2}{(1-\rho_i^2)+\rho_i^2\ln\rho_i^2}. \qquad (29) $$

FIG. 7. Field in the hole of hollow cylinders as function of the circular field at the surface of the cylinder for $\rho_i=R_i/R_o=0.60$. Solid lines: experimental curves for mercury samples V and VI. Dots: theoretical values for $C=10$. Dashed curve: theoretical values for $C=\infty$. Dot-dashed curve: $\chi=(1+\varphi_0^2)^{1/2}$ that corresponds to $H_z(\rho_i,\varphi_0)=H_c$.

It follows from Eq. (29) that the flux in the intermediate state is always greater than in the normal state: $2>K_{m\varphi}>1$. $K_{m\varphi}$ depends only on the ratio of the inner to the outer diameter and is independent of the current. In Fig. 6, we have plotted $K_{m\varphi}$ vs $\rho_i$.
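
The bound $2>K_{m\varphi}>1$ can be checked directly from the dimensionless form of Eq. (29). The following is a minimal sketch, assuming Eq. (29) reads $(1-\rho_i^2)^2/[(1-\rho_i^2)+\rho_i^2\ln\rho_i^2]$; the function name is hypothetical.

```python
import math


def km_phi(rho_i: float) -> float:
    """Circular-flux ratio of Eq. (29) for a hollow cylinder, 0 < rho_i < 1."""
    if not 0.0 < rho_i < 1.0:
        raise ValueError("rho_i must lie strictly between 0 and 1")
    one_minus = 1.0 - rho_i ** 2
    return one_minus ** 2 / (one_minus + rho_i ** 2 * math.log(rho_i ** 2))


if __name__ == "__main__":
    # Tabulate the ratio for a few wall thicknesses.
    for rho_i in (0.1, 0.3, 0.6, 0.9):
        print(rho_i, km_phi(rho_i))
```

For very thin walls ($\rho_i$ close to 1) the numerator and denominator both become small, so a series expansion would be a safer choice than this direct evaluation.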

This effect has been reported in the literature⁶
although the authors finally doubted its existence
(1939).

Superimposing an additional field H, further increases 
the flux, which Steiner! called the transverse para- 
magnetic effect. 

If we superimpose now an external field $H_{z0}$, we again get the splitting up of the conical rings. Measurements of the increase of the longitudinal flux in the hole and over the whole cross section have been reported (III, Fig. 6). The two mercury samples had the outer radii: sample V: $R_o=0.4375$ cm, sample VI: $R_o=0.810$ cm. The ratio of the inner to the outer radius was the same in both cases: $\rho_i=R_i/R_o=0.60$.

We assume, now, that we have in the bulk material the same distribution of current and field as in a solid cylinder and that we have a thin current layer at the inner surface which brings the field $H_\varphi$ up to the necessary value.

The field $H_z$ on both sides of this layer will be about the same; i.e., it will have approximately the value which it has at $\rho=\rho_i$ in a solid cylinder.

6 K. Steiner, Physik. Z. 38, 880 (1937); Stark, Steiner, and Schoeneck, Physik. Z. 38, 887 (1937); K. Steiner and H. Schoeneck, Physik. Z. 40, 43 (1939).

We now find that the curves of the measurements in the hole come very close together if we plot $\chi(\rho_i,\varphi_0)=H_z(\rho_i,\varphi_0)/H_{z0}$ as a function of $\varphi_0=H_{\varphi 0}/H_{z0}$. This is done in Fig. 7 (solid curves). There is a slight change of the curves with $H_{z0}$, indicating that the layer still contributes somewhat to the field $H_z$. From Eq. (14) it follows that

$$ \chi(\rho_i,\varphi_0) \le (1+\varphi_0^2)^{1/2}, \qquad (30) $$

which is generally observed by the measurements. One case, where $\chi$ is slightly above the curve $\chi=(1+\varphi_0^2)^{1/2}$ (the dot-dashed curve), is probably due to a slightly wrong scaling factor in the measurements of $\chi$. If $\chi=(1+\varphi_0^2)^{1/2}$, then the layer current is zero.

We can now obtain $\chi(\rho_i,\varphi_0)$ for $C=10$ from our curves in Fig. 3 (dots). If we assume Eq. (23) to hold and choose $C=\infty$, then $\chi(\rho_i,\varphi_0)$ is given by

$$ \chi(\rho_i,\varphi_0) = [1+\varphi_0^2(1-\rho_i^2)]^{1/2}. \qquad (31) $$

This curve is also plotted in Fig. 7 (broken curve). We see that the scattering which we expect on account of a change in $C$ is relatively small. The fact that the curves for higher $H_{z0}$ values are above the $C=\infty$ curve is, as already mentioned, probably due to a contribution of the layer current to $H_z$.

The asymptote of $\chi(\rho_i,\varphi_0)$ in Eq. (31) is given by

$$ \chi_{\mathrm{asympt}} = \varphi_0(1-\rho_i^2)^{1/2}, $$

which intersects the $\varphi_0$-axis at $\varphi_0=0$. The asymptotes of the measured curves seem to intersect the $\varphi_0$-axis at $\varphi_0=0.25$. However, the measurements are not carried far enough to tell whether this difference is real.

These measurements give one very significant indication: $\chi$ is apparently a function of $\varphi_0$ rather than of $\gamma$. This means our differential equations, Eqs. (20) and (21), are probably right and the change from Eq. (4) to Eq. (3) arises out of a change in the expression for $B$, Eq. (18).

The measurements of the flux through the whole cross section of hollow cylinders as functions of the total current have little meaning. The more important quantity is the flux through the ring section. Furthermore we want to plot the flux, or $K_m$, versus $\gamma$ rather than versus $I$.

The first conversion is easy to make. We have:

$$ K_{m\,\mathrm{ring}} = \frac{R_o^2\,K_m(\text{whole sample}) - R_i^2\,\chi(\text{hole})}{R_o^2-R_i^2}. \qquad (32) $$

According to Eq. (2), $\gamma$ is given by

$$ \gamma = \varphi_0(1-I_g/I); $$

this means by

$$ \gamma = \varphi_0\,(\text{correction factor}). $$

We easily find $\varphi_0$ from the known values of $H_{\varphi 0}$ and $H_{z0}$. However, there is some ambiguity as to what value we have to use for this correction factor. Without special reasoning we will use the following: $I_g=1.7$ amp as for solid mercury cylinders; $I=I_{\mathrm{bulk}}$, the bulk current, that is the total current minus the layer current. The layer current is given by

$$ I_{\mathrm{layer}} = 2\pi R_i\,H_\varphi(\rho_i,\varphi_0), $$

where $H_\varphi(\rho_i,\varphi_0)$ is the value just inside the current layer and may be calculated from

$$ H_\varphi(\rho_i,\varphi_0) = H_{z0}\,[1+\varphi_0^2-\chi^2(\text{hole})]^{1/2}. $$

$\gamma$ is now entirely given by known parameters of the measurements. In Fig. 8, the relative permeability for the ring section is plotted vs $\gamma$ for the two mercury samples V and VI and external field values of $H_{z0}=0.4$ amp/cm and $H_{z0}=1.6$ amp/cm. As for the measurements in the hole of the cylinder, there seems to be a slight dependence on $H_{z0}$, which may be attributed to the layer current.

In the same way as we did for the solid cylinder, we now identify our theoretical value $\varphi_0$ with $\gamma$.

Integrating in Fig. 4(a) the local $K_m(\rho)$ between $\rho=\rho_i$ and $\rho=1$, we get $K_{m\,\mathrm{ring}}(\rho_i,\varphi_0)$ for $C=10$ (lower broken curve in Fig. 8). For the limiting case $C=\infty$, we use Eq. (23) and find, in the same fashion as for the solid cylinder, by integrating, this time from $\rho=\rho_i$ to $\rho=1$,

$$ K_{m\,\mathrm{ring}} = \tfrac{2}{3}\{[1+\varphi_0^2(1-\rho_i^2)]^{3/2}-1\}/\varphi_0^2(1-\rho_i^2). \qquad (37) $$

This curve also is plotted in Fig. 8 (upper broken
curve).
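
The hollow-cylinder expressions for the limiting case $C=\infty$ are simple enough to evaluate directly. The sketch below is illustrative only: it assumes Eq. (31) for the field in the hole, Eq. (37) for the ring permeability, and the flux balance of Eq. (32) for converting whole-sample measurements; all function names are hypothetical.

```python
import math


def chi_hole(rho_i: float, phi0: float) -> float:
    """Eq. (31): reduced longitudinal field in the hole for C -> infinity."""
    return math.sqrt(1.0 + phi0 ** 2 * (1.0 - rho_i ** 2))


def km_ring(rho_i: float, phi0: float) -> float:
    """Eq. (37): mean permeability of the ring section for C -> infinity."""
    x = phi0 ** 2 * (1.0 - rho_i ** 2)
    return (2.0 / 3.0) * ((1.0 + x) ** 1.5 - 1.0) / x


def km_ring_from_measurements(r_outer: float, r_inner: float,
                              km_whole: float, chi_in_hole: float) -> float:
    """Eq. (32): ring permeability from whole-sample and hole measurements."""
    return ((r_outer ** 2 * km_whole - r_inner ** 2 * chi_in_hole)
            / (r_outer ** 2 - r_inner ** 2))


if __name__ == "__main__":
    # rho_i = 0.60 as for samples V and VI; phi0 values are placeholders.
    for phi0 in (0.5, 1.0, 2.0):
        print(phi0, chi_hole(0.60, phi0), km_ring(0.60, phi0))
```

With $\rho_i=0$ the ring expression reduces to the solid-cylinder result for $C=\infty$, which is a convenient consistency check.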

We see that the curves for $C=10$ and $C=\infty$ give about the right scattering range for the measured curves.

FIG. 8. Increase of the longitudinal flux in the ring section of hollow cylinders. Theoretical curves as function of $\varphi_0$. Experimental curves as function of $\gamma$.








IV. CONCLUSIONS


Although we did not end up exactly with the experi- 
mental result, it seemed worth while to demonstrate 
what an extension of the ordinary theory of the current- 
carrying superconductor would give for the “para- 
magnetic effect.” In this extension essentially only one 
new assumption was made: It was assumed that the 
shape of the superconducting particles is independent 
of the radius. This was made plausible by pointing out 
that the local field around the particles is always the 
critical field. 

It seems that essentially new concepts are necessary in order to bring the theory into exact agreement with the experiment. Only these new concepts will bring the explanation of the “magic constants” $I_g$.

The author wishes to thank the National Science 
Foundation for supporting this work by a grant. 
Superconductivity has been found in three samples of uranium. The transition is broad (0.2 to 0.3°K
wide), and is centered at 0.77°K for two samples and at 0.80°K for the third.





There have been a number of conflicting reports on the question of the temperature at which uranium becomes a superconductor. Aschermann and Justi,¹ by means of resistance measurements, found that uranium became superconducting at 1.25°K. Shoenberg,² in an examination of the magnetic properties, found only a slight diamagnetism which began at 1.45°K and corresponded to about 0.5 percent by volume of the sample becoming superconducting. His measurements extended down to 1.08°K. Shoenberg states that his sample was very pure and that the effect Justi observed may have been due to an impurity becoming a superconductor. Alekseyevsky and Migunov³ also investigated uranium by the magnetic-ballistic method. They report that three very pure specimens became superconductors at 1.3°K while other less pure samples failed to become superconductors, presumably down to the lowest temperature attained, 0.06°K. They attribute Shoenberg’s negative results to lack of purity of his sample. Later, Goodman and Shoenberg⁴ reported that of a number of specimens of uranium tested, one did become superconducting at about 1.3°K and that this specimen was less pure than others that failed to show a transition above 1°K. In a later series of measurements, two of the samples that failed to show a transition above 1°K were cooled further and did become superconductors at about 0.75°K. Both the 1.3°K and the 0.75°K transitions were about 0.3°K wide.


TABLE I. Analyses of samples, in parts per million.








No. 10 No. 11 
<20 
500 
95 90 
40 35 
125 75 


Sample No. 12 





none detected 
100 








* Work performed under the auspices of the U. S. Atomic Energy Commission.
† Department of Chemistry, The Rice Institute, Houston, Texas.
‡ Department of Physics, University of Illinois, Urbana, Illinois.
1 G. Aschermann and E. Justi, Physik. Z. 43, 207 (1942).
2 D. Shoenberg, Nature 159, 303 (1947).
3 N. Alekseyevsky and L. Migunov, J. Phys. (U.S.S.R.) 11, 95 (1947).
4 B. B. Goodman and D. Shoenberg, Nature 165, 441 (1950).


The following is an account of some magnetic 
measurements on uranium made at this Laboratory 
during 1951-52. The samples were in the form of 
cylinders, 0.250 inch in diameter and 1.250 inch long. 
The search coil surrounding the sample had 20000
turns of No. 44 wire. The field was supplied by a 
Helmholtz coil. The susceptibility of the sample was 
observed by the mutual inductance-ballistic deflection 
method. The cryostat (Fig. 1) consisted of an outer 
liquid air Dewar, a middle helium Dewar and an inner 
helium Dewar whose volume was about 200 ml. The 
inner Dewar could be pumped on by a 500-liter/sec oil diffusion
pump and a large Kinney pump. The spiral radiation
shield was made of copper, blackened on the inside,
and was itself cooled to about 1.5°K by pumping on the 
liquid helium in the middle Dewar. The inner Dewar 
was soldered to the radiation shield with low-melting 
indium-tin solder. The contents of the inner Dewar 
could be held for long periods at constant temperature, 
over the range 4°K to 0.77°K. Without the heat leak 
resulting from the lead wires to the search coil it is 
probable that the inner Dewar temperature could be 
reduced several hundredths of a degree further. At 
pressures above 1 mm, pressure-temperature control 
was achieved by means of an automatically regulated 
bypass. In the lower temperature range, a butterfly 
valve in the six-inch pumping line was used to throttle 
the diffusion pump. The temperature of the sample was 
calculated from the corrected pressure, as measured 
on a series of pressure gauges, manometers and a 
McLeod gauge. This system was checked several times 
against the magnetic scale with fresh potassium chrome 
alum in the sample space. Good agreement was found. 

The design of the cryostat was evolved over a period 
of several years. In the arrangement used in the 1951 
measurements, the temperature could be reduced to 
0.82°K. A number of modifications were then made, 
principally in the design and helium jacketing of the 
radiation shield, so that in the 1952 measurements, 
temperatures as low as 0.76°K were occasionally 
reached. 

Three samples of uranium were investigated. Samples 
10 and 12 were in the as-cast condition; Sample 11 
was stock uranium. Sample 11 contained the most car- 
bon and iron, and Sample 10 the most sulfur. The other 
differences were minor. The total of impurities, other 
than those given in Table I, was less than 0.01 percent. 
All three samples showed an increase in diamagnetic
susceptibility beginning at about 0.9°K. In no case
was it possible to cool the sample all the way through
the superconducting transition. Measurements on a
sample of pure tin indicated that the sensitivity of the
apparatus was such that a completely superconducting
sample would give a ballistic deflection of almost
exactly 100 cm.

FIG. 1. Cross section of cryostat.

Sample 10 was cooled almost half way through the
superconducting transition, which in this case
apparently centers at 0.77°K. The deflection curve for
Sample 11 is very similar to that for Sample 10 except
that it is displaced about 0.03°K higher in temperature.
Sample 12 behaved almost indistinguishably from
Sample 10.

FIG. 2. Ballistic deflection vs temperature. ○ Sample 11, 1951;
● Sample 11, 1952; ▽ Sample 10, 1951; ▼ Sample 10, 1952.

In Fig. 2 the results of two runs each on Samples 10
and 11 are shown. Unfortunately, although Sample 11
became superconducting at a somewhat higher temperature
than Sample 10, the apparatus was not in quite
as good condition as usual on the two Sample 11 runs
and therefore not quite as low a temperature as usual
was attained.

There seems to be no doubt now that various samples 
of uranium become superconductors at various tempera- 
tures, but the reasons for this behavior are still some- 
what obscure. Apparently the transition temperature 
is insensitive to small amounts of sulfur, but may de- 
pend upon the presence of either carbon or iron, most 
probably the former since the change in iron concen- 
tration by more than a factor of two between samples 
10 and 12 resulted in no displacement of the transition 
temperature. Alternatively the metallurgical history of 
the specimen may be a more important factor than 
previously suspected. We cannot accept the explanation 
that the purer the sample, the higher the temperature 
of the transition. 
A general method is outlined for determining the number of
vacant lattice sites or interstitial atoms in a monatomic solid
exposed to neutron radiation. The colliding atoms are assumed to
be within the energy range for which the orbital picture can be
applied. Following the treatment of Bohr, the scattering regions
of excessive and moderate screening, Rutherford distribution, and
electronic collisions are considered separately. The number of
vacancies or interstitial atoms as a function of the energy of the
primary knocked-out atom is given by the solution of certain
integral equations that are different for various energy regions
considered. It is found that if the velocity of a recoil atom resulting
from neutron collision is less than $e^2/\hbar$ (region of elastic collisions)
approximately half of its energy is used up to produce vacancies
or interstitials. If the velocity of the recoil atom is above $e^2/\hbar$
(region of inelastic collisions) then the energy used up to produce
vacancies and interstitials is approximately constant for medium
and heavy elements. A simple formula has been derived expressing
the average number of vacant lattice sites or interstitials produced
in a collision of a neutron having energy $E$ in a monatomic solid
composed of medium or heavy elements having atomic mass $M$.
The formula is as follows:

$$G(E)\approx(nE-\alpha)^2/4\alpha nE \quad\text{for } E\le\gamma/n,$$
$$G(E)\approx[(nE-\alpha)^2-(1-R)(nE-\alpha-\gamma)^2]/4\alpha nE \quad\text{for } E\ge\gamma/n,$$

where $\gamma=Me^4/2\hbar^2$; $n=4M/(M+1)^2$, $\alpha$ is the binding energy of
an atom in the lattice, and $R$ is a slowly varying function of $Z$.





I. INTRODUCTION 


Heavy corpuscular radiations such as neutrons or
ionizing particles which enter a solid dissipate a 
portion of their energy in close encounters with the 
constituent atoms of the solid and eject some of them 
irreversibly from their normal positions thus producing 
vacant lattice sites and interstitial atoms which we shall 
designate as “displacements” and “displaced atoms.” 
The properties of the solid change with the number of 
the displaced atoms and it is the purpose of the present 
investigation to determine the number of such dis- 
placements. 

Wigner was the first to call attention to these phenomena
and the earlier treatments of this subject are based
largely on the pioneer work of Seitz¹ and utilize the
Born approximation for determining the elastic collisions
of atoms. The Born approximation is, however,
applicable in a range of velocities considerably higher
than those encountered in this problem. We are, therefore,
applying classical considerations based on an
extended study of the subject made by Bohr.² Also, we
give a more detailed analysis of the cumulative processes
leading to the atomic displacements.

If the atom knocked out from its normal lattice position
has acquired a relatively high velocity, it will lose
much of its energy by colliding with the individual electrons
and thereby excite and ionize other atoms in the
solid. These processes are designated as “inelastic collisions.”
As the atom slows down, the relative amount
of energy lost by inelastic collisions decreases and most
of the energy loss is due to direct hits on other atoms
in the solid. The latter process is designated as an
“elastic collision” and is effective in producing atomic
displacements.

1 F. Seitz, Discussions Faraday Soc. 5, 271 (1949).
2 N. Bohr, Kgl. Danske Videnskab. Selskab, Mat.-fys. Medd.
18, No. 8 (1948).

We consider a monatomic solid composed of atoms
having atomic number $Z$. Following Bohr, we shall use
a simplified picture assuming that the stopping of a
knocked-out atom having energy $x$ is due almost entirely
to inelastic collision if

$$\omega>1 \quad\text{or}\quad x>\gamma=Me^4/2\hbar^2. \tag{1}$$

$\omega$ designates the velocity of the atom and is measured
in “atomic units,” i.e.,

$$\omega=v/v_0, \tag{2}$$

where $v_0$ is the “velocity” of the electron in the hydrogen
orbit:

$$v_0=e^2/\hbar=2.18\times10^{8}\ \text{cm/sec}. \tag{3}$$

We shall also assume that the stopping is entirely due
to elastic collisions if

$$\omega<1 \quad\text{or}\quad x<\gamma. \tag{4}$$


II. ELASTIC COLLISIONS
A. Formulation of the Problem 


The mechanism of collisions is assumed to be the
same as in the previous treatments and is based on the
existence of a binding energy $\alpha$ of the lattice atoms. In
determining the energy distribution of the struck atoms,
we use the cross section for collisions with free atoms.
If the energy acquired by this atom as a result of a
collision exceeds $\alpha$, we assume that it leaves its cell in
the crystal and produces a permanent displacement. If,
however, this energy is below $\alpha$, then we assume that
the lattice acquires vibrational energy which becomes
eventually dissipated without producing any permanent
changes. The value of $\alpha$ is not accurately determined
and has been taken in the past to be of the order of 25 ev.
Each atom knocked out of the lattice as a result of
a collision gradually loses its energy in secondary col- 
lisions, thus generating secondary particles. Each 
secondary particle thus released moves through the 
lattice and releases by the same mechanism tertiary 
particles. This process may continue for several genera- 
tions until the energy of the particles released after 
several stages is insufficient to knock out any additional 
particles of the lattice and is dissipated in the form of 
heat. 

Assume that as a result of the mechanism described
above, a primary knocked-out atom having energy $x$
produces $g(x)-1$ displacements in all successive stages,
i.e., the total number of displacements is $g(x)$ including
the primary atom. If the atoms in the solid were free
(i.e., if we neglect their binding energy) then the energy
$y$ acquired by each atom would be used entirely to
produce further collisions. Since the atoms are not free,
a portion of the acquired energy is used to free the
atom from its bond and the remainder $(y-\alpha)$ is the
kinetic energy that is effective in producing further
collisions. Thus


(5)


Let $K(x,y)$ be the probability that the primary atom
loses energy in $dy$ about $y$ in an elastic collision. We have

$$K(x,y)=\sigma(x,y)/\sigma(x), \tag{6}$$

where $\sigma(x,y)\,dy$ is the differential cross section for an
atom of energy $x$ to lose an amount of energy in $dy$ about
$y$ in an elastic collision and $\sigma(x)$ is the corresponding
total cross section for an elastic collision.

Assume that the struck atom gets energy in $dy$ at $y$
and then the primary atom has energy $x-y$. If $y\ge\alpha$,
then the number of displacements is $g(x-y)+g(y-\alpha)$,
but if $y<\alpha$ then the struck atom is not displaced and
the number is $g(x-y)$.

Thus for $x>\alpha$, $g(x)$ is a solution of the integral equation

$$g(x)=\int_0^x g(x-y)K(x,y)\,dy+\int_\alpha^x g(y-\alpha)K(x,y)\,dy, \tag{7}$$

with

$$g(x)=1 \quad\text{for } x<\alpha.$$

If we define $g(x)=0$ for $x<0$, then (7) is satisfied also
for $0\le x\le\alpha$.


B. Determination of the Kernel 
1. General 


We are dealing with a collision of two identical atoms
which, in the center-of-mass coordinates, is represented
as an interaction of a heavy particle having mass

$$M_0=M/2, \tag{8}$$

with a screened field characterized by

$$V=(Z^2e^2/r)\exp(-r/a), \tag{9}$$

with the screening parameter

$$a=\hbar^2/(\sqrt{2}\,me^2Z^{1/3}). \tag{10}$$


We shall apply the orbital picture to the study of the 
collisions of particles in the field (9). Since previous 
investigations utilized the Born approximation it 
appears to be desirable to consider the range of validity 
of the two methods and their possible applicability to 
this problem. 


2. Region of Validity of the Born Approximation 
and of the Orbital Picture 


A convenient criterion for the validity of the Born
approximation is that the scattered field is small when
compared to the incident field at the source. When
applied to the screened Coulomb field, this criterion
can be expressed as follows³:

$$\frac{Z^2e^2}{\hbar v}\ln\left(\frac{M_0va}{\hbar}\right)\ll1. \tag{11}$$

The inequality (11) should be satisfied for velocities
$\omega\gg2\sqrt{2}\,Z^{1/3}m/M$, which are those encountered in the
present problem.

Substituting (2) and (3) in (11) and taking into
account the fact that the term under the logarithm in
(11) is considerably larger than one, we obtain

$$\omega\gg Z^2. \tag{12}$$

Thus, the Born approximation is not applicable to
medium and heavy elements and for the light elements
its applicability is limited to a relatively high-energy
region.

The criterion for the validity of the classical picture
has been established by Williams⁴ and can be expressed
as follows: (a) the wavelength of the moving particle
is very small when compared to the screening parameter
and (b) the uncertainty in the momentum of the particle
is much less than the disturbance caused by the deflection
in the field $V$.

The assumption (a) leads to the following inequality:

$$(\hbar/a)\ll M_0v. \tag{13}$$

Taking into account (2), (3), (8), and (10), we can
express (13) as follows:

$$\omega\gg2\sqrt{2}\,Z^{1/3}m/M. \tag{14}$$

This inequality is satisfied in all practical cases.
The assumption (b) leads to the following inequality:

$$Vr/\hbar v>1. \tag{15}$$

Substituting (2), (3), and (9) in (15), we obtain

$$(Z^2/\omega)\exp(-r/a)>1. \tag{16}$$

3 L. I. Schiff, Quantum Mechanics (McGraw-Hill Book Company,
Inc., New York, 1949), p.
4 E. J. Williams, Revs. Modern Phys. 17, 217 (1945).
It is noted that for a given value of $Z$ and $\omega$ the
condition (16) can never be satisfied for the whole
region of space and, consequently, the orbital picture
cannot be generally applied. We may associate with
each value of $\omega$ a radius

$$R=a\ln(Z^2/\omega), \tag{17}$$

such that only for $r<R$ is the orbital picture valid. If,
however, the region defined by the radius $R$ is sufficiently
large so as to include most of the space occupied
by the scattering potential, i.e., if

$$R>a \quad\text{or}\quad \ln(Z^2/\omega)>1, \tag{18}$$

we may assume that the orbital picture applies to the
whole space.

Since in the present problem $\omega\ll Z^2$, we are dealing
with a problem in which the orbital picture is valid.


3. Scattering of Particles in the Screened Coulomb Field 


A complete study of the scattering of particles in a
screened Coulomb field has been given by Bohr. Bohr
divides the space in which the scattering occurs into
regions defined by a screening parameter⁵

$$\zeta=b/a, \tag{19}$$

where $b$ designates the collision diameter, i.e.,

$$b=2Z^2e^2/M_0v^2. \tag{20}$$

Taking into account (8), (19), (20) and putting
$x=Mv^2/2$, we obtain

$$x=2Z^2e^2/\zeta a. \tag{21}$$

The character of the problem depends essentially
upon the value of the screening parameter $\zeta$. Following
Bohr, we shall simplify the picture by assuming that,
for relatively slow particles such that

$$\zeta>1 \quad\text{or}\quad x<\beta=2Z^2e^2/a, \tag{22}$$

the nuclei of the two colliding atoms will not penetrate
substantially within each other's electronic shells and
the scattering has a spherically symmetrical angular
distribution in the center-of-mass coordinates. We shall
designate this region as the “region of isotropic scattering.”

If, however, the particle has a relatively high energy,
such that

$$\zeta<1 \quad\text{or}\quad x>\beta, \tag{23}$$

we assume that the nuclei of the colliding atoms will
penetrate substantially within each other's electronic
shells and the scattering in the center-of-mass coordinates
will conform over a considerable angular interval
with the Rutherford law. This region shall be designated
as the “region of Rutherford scattering.”

(a) Region of isotropic scattering ($x<\beta$). We have⁶

$$\sigma(x,y)=\sigma(x)/x. \tag{24}$$

5 Reference 2, p. 20.
6 Reference 2, p. 49, Eq. (2.2.8).
The total cross section $\sigma(x)$ depends upon the effectiveness
of screening. For very low velocity in the region
of excessive screening ($\zeta\gg1$) it is of the order of magnitude
of the gas kinetic cross section and for higher
velocities in the region of moderate screening ($\zeta\approx1$) it
has the form $\sigma(x)\approx\pi\zeta a^2/\epsilon$, where $\epsilon=2.74$. Taking into
account (6) and (22), we have

$$K(x,y)=1/x \quad\text{for } 0<y<x<\beta. \tag{25}$$


(b) Region of Rutherford scattering. Bohr has replaced
the screened field by a Coulomb field confined
within a sphere of radius $a$. Consequently,

$$\sigma(x)=\pi a^2. \tag{26}$$

The cutoff introduced by the maximum impact parameter
equal to $a$ is equivalent to an angular cutoff $\theta_{\min}$
defined by⁷

$$\tan(\theta_{\min}/2)=b/2a. \tag{27}$$

Substituting (20) and (22) in (27) we obtain

$$\sin^2(\theta_{\min}/2)=\beta^2/(4x^2+\beta^2). \tag{28}$$

Consequently, the amount of energy $y$ lost by the particle
during the collision is comprised within the energy
range

$$y_{\min}<y<x, \tag{29}$$

where

$$y_{\min}=x\sin^2(\theta_{\min}/2)=x\beta^2/(4x^2+\beta^2). \tag{30}$$

The cutoff value associated with an energy $x$ will be
denoted by $x_1$ in all that follows. Thus

$$x_1=x\beta^2/(\beta^2+4x^2). \tag{31}$$

The solution of our problem is found to depend, in part,
on the ratio of $x_1$ to $\alpha$.

From (22) and (10), we have (assuming that $\alpha=25$
ev),

$$\frac{\beta}{\alpha}=\frac{2\sqrt{2}\,me^4}{\hbar^2\alpha}Z^{7/3}\approx3.078\,Z^{7/3}\ge39.95 \quad\text{for } Z\ge3; \tag{32}$$

and from (1) we have that

$$\gamma/\alpha=Me^4/2\hbar^2\alpha\approx2\times10^{3}Z\ge6\times10^{3} \quad\text{for } Z\ge3. \tag{33}$$
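The numerical coefficients in (32) and (33) can be checked directly from the definitions of $\beta$, $a$, and $\gamma$. The following is a minimal sketch, not part of the original paper. The constants `RYDBERG_EV` and `ELECTRON_PER_AMU` are modern reference values, and taking $M\approx2Z$ mass units is an assumption made only to reproduce the quoted order of magnitude.

```python
import math

RYDBERG_EV = 13.605693        # e^2 / (2 a_0) in eV (modern value, assumed)
ELECTRON_PER_AMU = 1822.888   # atomic mass unit / electron mass (assumed)

def beta_over_alpha(Z, alpha_ev=25.0):
    """beta/alpha from Eqs. (10) and (22).

    beta = 2 Z^2 e^2 / a with a = a_0 / (sqrt(2) Z^(1/3)), so
    beta = 2 sqrt(2) Z^(7/3) e^2 / a_0, and e^2 / a_0 = 2 Rydberg.
    """
    return 2.0 * math.sqrt(2.0) * (2.0 * RYDBERG_EV) * Z ** (7.0 / 3.0) / alpha_ev

def gamma_over_alpha(M_amu, alpha_ev=25.0):
    """gamma/alpha from Eq. (1): gamma = M e^4 / (2 hbar^2) = (M/m) Rydberg."""
    return M_amu * ELECTRON_PER_AMU * RYDBERG_EV / alpha_ev

coefficient = beta_over_alpha(1)        # compare with 3.078 in Eq. (32)
lithium_beta = beta_over_alpha(3)       # compare with 39.95 in Eq. (32)
gamma_per_Z = gamma_over_alpha(2.0)     # M ~ 2Z: compare with 2 x 10^3 in Eq. (33)

print(f"beta/alpha  = {coefficient:.4f} Z^(7/3)")
print(f"beta/alpha  = {lithium_beta:.2f} for Z = 3")
print(f"gamma/alpha = {gamma_per_Z:.0f} Z (taking M = 2Z)")
```

With these inputs the coefficient agrees with the printed 3.078 to about three figures; small differences in the last digit reflect the constants used.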


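It is clear that $x_1$ is a decreasing function of $x$ and we
have

$$x_1=\alpha \quad\text{for}\quad x=\frac{\beta^2}{8\alpha}\left[1+\left(1-\frac{16\alpha^2}{\beta^2}\right)^{1/2}\right]. \tag{34}$$

We shall denote this value by $\beta'$. Similarly, for

$$x=\beta''=\frac{\beta^2}{16\alpha}\left[1+\left(1-\frac{64\alpha^2}{\beta^2}\right)^{1/2}\right] \tag{35}$$

the cutoff value is $(\beta'')_1=2\alpha$.

7 Reference 2, p. 6, Eq. (1.1.3).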
TABLE I. Numerical values of the parameters, for $\alpha=25$ ev.

  $Z$    $\beta''/\alpha$    $\beta'/\alpha$    $\gamma_1/\alpha$    $\gamma/\alpha$
   7       10280        20560       1.489       $1.381\times10^{4}$
   6        5006        10020       0.8463      $1.183\times10^{4}$
   5        2137         4276       0.4008      $1.067\times10^{4}$
   4         752.9       1509       0.1697      $8.893\times10^{3}$
   3         195.1        393.3     0.05762     $6.844\times10^{3}$

For the convenience of the reader, the following
table of numerical values is given (Table I). In calculating
these values, it has been assumed that $\alpha=25$ ev.
It is easily seen that these quantities are increasing
functions of $Z$. Since⁸

$$\sigma(x,y)=\pi Z^4e^4/xy^2, \tag{36}$$

we obtain, taking into account (6), (26), and (36),

$$K(x,y)=\beta^2/4xy^2 \quad\text{for } x\beta^2/(4x^2+\beta^2)<y<x,$$
$$K(x,y)=0 \quad\text{for } y<x\beta^2/(4x^2+\beta^2). \tag{37}$$

C. Determination of the Number of Displacements 


We shall proceed now to determine $g(x)$ from (7) in
the region of isotropic scattering and Rutherford scattering.

Substituting $K(x,y)$ as defined by (25) and (37) in
(7), we obtain

$$g(x)=1+\frac{1}{x}\int_{\alpha}^{x}du\,g(u) \quad\text{for } \alpha<x<2\alpha, \tag{38}$$

$$g(x)=\frac{2\alpha}{x}+\frac{1}{x}\int_{\alpha}^{x}du\,g(u)+\frac{1}{x}\int_{\alpha}^{x-\alpha}du\,g(u) \quad\text{for } 2\alpha<x<\beta, \tag{39}$$

$$g(x)=\int_{x_1}^{x}\frac{dt\,\beta^2}{4xt^2}\,g(x-t)+\int_{[\alpha,x_1]_>}^{x}\frac{dt\,\beta^2}{4xt^2}\,g(t-\alpha) \quad\text{for } \beta<x<\gamma, \tag{40}$$

where $x_1$ is defined by (31) and $[\alpha,x_1]_>$ designates the
larger of the values $\alpha$ and $x_1$.

As shown in the Appendix, the solution of (38), (39)
and (40) is as follows:

For $\alpha\le x\le\beta$, we have

$$A(x+\alpha)/2\alpha\le g(x)\le B(x+\alpha)/2\alpha, \tag{41}$$

with $A=1$ and $B=8/7$.

For $\beta\le x\le[\beta',\gamma]_<$, where $[\beta',\gamma]_<$ indicates the
smaller of $\beta'$ and $\gamma$, $A=1$ is also valid and $\beta'>\gamma$ if $Z>6$.
For $Z<6$ and $\beta'\le x\le\gamma$, $A$ may be taken from the table:

  $Z$      5         4         3
  $A$    0.9627    0.8770    0.7542

8 Reference 2, p. 42, Eq. (2.2.2).

For $\beta\le x\le[\beta'',\gamma]_<$, we may take $B=1.15$ and for
$Z>7$, $\beta''>\gamma$. If $Z\le7$ and $\beta''\le x\le\gamma$, then we take
$B=(8/7)+0.07$. (See Appendix 109a.)
On the basis of the above results, we may assume
that in the region of elastic collisions, i.e., for $0<x<\gamma$,
the following relation is approximately correct

$$g(x)\approx x/2\alpha; \tag{42}$$

i.e., approximately half of the energy of a recoil atom
is used to produce displacements.
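The bounds (41) can be checked numerically in the region of isotropic scattering, where $K(x,y)=1/x$ and (7) reduces to $x\,g(x)=\int_0^x g(u)\,du+\int_0^{x-\alpha}g(u)\,du$ with $g=1$ on $[0,\alpha]$. The sketch below is an illustration under stated assumptions, not the method of the paper: it uses a simple trapezoid-rule march in $x$, measures energies in units of $\alpha$, and ignores the upper limit $\beta$ of the isotropic region. The function name `solve_isotropic` and the grid parameters are hypothetical.

```python
import math

def solve_isotropic(alpha=1.0, x_max=10.0, steps_per_alpha=200):
    """March x*g(x) = int_0^x g + int_0^(x-alpha) g forward in x.

    g = 1 on [0, alpha]. With the trapezoid rule the unknown g_i appears
    on both sides, so it is solved for explicitly at each step.
    """
    h = alpha / steps_per_alpha
    m = steps_per_alpha                  # grid offset corresponding to alpha
    n = int(round(x_max / h))
    xs = [i * h for i in range(n + 1)]
    g = [1.0] * (n + 1)
    cum = [0.0] * (n + 1)                # cum[i] = integral of g from 0 to xs[i]
    for i in range(1, n + 1):
        if i > m:
            known = cum[i - 1] + 0.5 * h * g[i - 1] + cum[i - m]
            g[i] = known / (xs[i] - 0.5 * h)
        cum[i] = cum[i - 1] + 0.5 * h * (g[i - 1] + g[i])
    return xs, g

xs, g = solve_isotropic()
for x_over_alpha in (1.5, 2.0, 5.0, 10.0):
    i = int(round(x_over_alpha * 200))
    lower = (xs[i] + 1.0) / 2.0          # A = 1 in Eq. (41)
    upper = 4.0 * (xs[i] + 1.0) / 7.0    # B = 8/7 in Eq. (41)
    print(f"x/alpha={x_over_alpha:5.1f}  lower={lower:.4f}  g={g[i]:.4f}  upper={upper:.4f}")

exact_at_1p5 = 1.0 + math.log(1.5)       # exact solution 1 + ln(x/alpha) on [alpha, 2 alpha]
```

On $[\alpha,2\alpha]$ the computed values can be compared with the exact solution $1+\ln(x/\alpha)$ given in the Appendix, which provides a check on the discretization.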


III. INELASTIC COLLISIONS
A. General 


We are dealing here with a region in which the 
velocity of the moving particle is less than the orbital 
electron velocities in the solid and at the present time 
there is no exact theory to evaluate the excitation and 
ionization losses in this region. Our computation of 
energy losses will be based on certain assumptions made 
by Bohr that are applicable to intermediate and heavy 
elements. 

In the case of lighter elements it is necessary to
compute separately the effectiveness of each electronic
orbit as done by Livingston and Bethe,⁹ Hirschfelder
and Magee,¹⁰ and Neufeld.¹¹


B. Calculation of the Energy Loss 


According to Bohr the rate of energy loss of a particle
in the above range moving through a medium composed
of intermediate or heavy elements is as follows¹²:


dx/dz=NBon.(3[k}>*#+[k}), (43) 


where z is the length of the particle track, N is the 
number of atoms per cm* of the solid, 
2n(Z*)e4 20 2Ze 
= a Zt ; 


aS 


mv Vo hv 


[k]=« forx>1 and [k]=1 forxc<1. (44) 


Here $Z^*$ represents the charge of the moving particle
and can be assumed to vary with the velocity as follows:

$$Z^*=Z^{1/3}v/v_0. \tag{45}$$

We take $\kappa>1$. From this it follows that $\omega<2Z$ and,
consequently, the formula (43) is valid for $\gamma<x<4Z^2\gamma$.
Substituting (44) and (45) in (43), we obtain


$$dx/dz=Cx^{1/2}, \tag{46}$$

where

4nv2N Zh? 
C=———[3(228)-1+-2(Z!)]. 
M'me?                                   (47)

9 M. S. Livingston and H. A. Bethe, Revs. Modern Phys. 9, 263
(1937).
10 J. O. Hirschfelder and J. L. Magee, Phys. Rev. 73, 207 (1948).
11 J. Neufeld, Proc. Phys. Soc. A66, 590 (1953).
12 Reference 2, p. 102, Eq. (3.5.7).
FIG. 1. Stopping power of copper ions in copper. (Abscissa: energy of the
moving copper atom in ev, logarithmic scale; upper scale: velocity of the
moving copper atom.)


In view of the complete lack of experimental evidence,
we are not able to verify the results expressed by
(46). Some approximate estimates show, however, that
this expression gives at least the right order of magnitude
for the energy loss. In Fig. 1, the line FG is based
on (46) and represents the stopping power of recoil
atoms in copper. The line HI shows a rough estimate
of the stopping power which has been obtained by
assuming

$$\frac{1}{N}\frac{dx}{dz}=(Z^{*2})_{\mathrm{av}}\,\sigma, \tag{48}$$

where $\sigma$ designates the “specific electronic cross section”
and $(Z^{*2})_{\mathrm{av}}$ is the average of the square of the moving
charge.¹³ Using the experimental data of Warshaw¹⁴
giving $(1/N)(dx/dz)$ for protons in copper, and the
experimental data of Hall¹⁵ for $(Z^{*2})_{\mathrm{av}}$ for protons, we
derived from (48) the specific electronic cross section $\sigma$
for copper corresponding to various velocities of the
moving ion. The stopping power due to inelastic collisions
has been calculated by using

$$\frac{1}{N}\frac{dx}{dz}=(Z^*)^2\sigma, \tag{49}$$

where $Z^*$ for copper has been determined from (45). It
is noted that the curve HI gives values of the same
order of magnitude as those shown on the curve FG.
The assumptions leading to the curve HI have been
very rough and, therefore, the above calculations are
not considered as a verification of the formula (43).
They do indicate, however, that this formula is of the
right order of magnitude.

13 J. Knipp and E. Teller, Phys. Rev. 59, 659 (1941).
14 S. D. Warshaw, Phys. Rev. 76, 1759 (1949).
15 T. Hall, Phys. Rev. 79, 504 (1950).


C. Determination of the Number of Displacements 


We consider here a particle having energy $x>\gamma$.
While slowing down the particle loses its energy by
inelastic collisions in accordance with (43) and also
participates in elastic collisions in accordance with (37).
Only the latter process is effective in producing lattice
holes and interstitial atoms.

The probability that the particle has traversed a
distance $z$ without suffering an elastic collision is
$\exp(-N\pi a^2z)$, where $\pi a^2$ is the scattering cross section.

Thus, the probability that the striking atom has its
first elastic collision while traversing an element of path
length $dz$ after going a distance $z$ along its path without
an elastic collision is

$$\exp(-N\pi a^2z)\,N\pi a^2\,dz. \tag{50}$$

Using (46), we have that

$$z=(2/C)(x^{1/2}-t^{1/2}), \tag{51}$$

and

$$dz=-dt/Ct^{1/2}. \tag{52}$$

Substituting (51) and (52) in (50), the probability
that the striking atom has its first elastic collision in
an energy interval $dt$ about $t$ is given by

$$\exp\left[-\frac{2N\pi a^2}{C}(x^{1/2}-t^{1/2})\right]\frac{N\pi a^2\,dt}{Ct^{1/2}}. \tag{53}$$

Once an elastic collision has occurred, the probability
distribution for the energy lost by the striking atom is
given by (37), and thus the probability that the striking
atom having energy $x$ will have an elastic collision in an
energy interval $dt$ about $t$ where $t>\gamma$, and that it loses
energy $dy$ about $y<t$ is given by

$$\exp\left[-\frac{2N\pi a^2}{C}(x^{1/2}-t^{1/2})\right]\frac{N\pi a^2\,dt}{Ct^{1/2}}\,\frac{\beta^2\,dy}{4ty^2} \tag{54}$$

for $t\beta^2/(\beta^2+4t^2)<y<t$, and is 0 for $0\le y\le t\beta^2/(\beta^2+4t^2)$.

Then the average number of displacements produced
by the particle which had its first collision within an
interval $dt$ about $t$ is

$$\exp\left[-\frac{2N\pi a^2}{C}(x^{1/2}-t^{1/2})\right]\frac{N\pi a^2\,dt}{Ct^{1/2}}\left\{\int_{t\beta^2/(\beta^2+4t^2)}^{t}\frac{dy\,\beta^2}{4ty^2}\,g(t-y)+\int_{[\alpha,\,t\beta^2/(4t^2+\beta^2)]_>}^{t}\frac{dy\,\beta^2}{4ty^2}\,g(y-\alpha)\right\}. \tag{55}$$
If the striking atom reaches energy $\gamma$ without an
elastic collision, then its first elastic collision is at
energy $\gamma$ since it can, in our model, lose no more energy
by inelastic collisions. The probability that the first
collision is at energy $\gamma$ is given by

$$\exp\left[-\frac{2N\pi a^2}{C}(x^{1/2}-\gamma^{1/2})\right]. \tag{56}$$

Consequently $g(x)$ can be expressed as follows:

$$g(x)=g(\gamma)\exp[-2N\pi a^2(x^{1/2}-\gamma^{1/2})/C]$$
$$\qquad+\int_{\gamma}^{x}dt\,\frac{N\pi a^2}{Ct^{1/2}}\exp[-2N\pi a^2(x^{1/2}-t^{1/2})/C]\left\{\int_{t\beta^2/(4t^2+\beta^2)}^{t}dy\,\frac{\beta^2}{4ty^2}\,g(t-y)+\int_{[\alpha,\,t\beta^2/(4t^2+\beta^2)]_>}^{t}dy\,\frac{\beta^2}{4ty^2}\,g(y-\alpha)\right\}. \tag{57}$$

In the right-hand side of the above expression, the
first term represents the number of displacements
produced by the particle after it had slowed down from
the initial energy $x$ to $\gamma$, and the second term represents
the number of displacements produced while the particle
is slowing down from $x$ to $\gamma$.
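The two weights appearing in (57) should exhaust all possibilities: the first-collision density (53) integrated from $\gamma$ to $x$, plus the probability (56) of reaching $\gamma$ without an elastic collision, should equal unity. The sketch below verifies this numerically. It is only an illustration; the single parameter `lam` stands for the combination $N\pi a^2/C$, and the numerical values chosen for `x`, `gamma`, and `lam` are arbitrary.

```python
import math

def first_collision_density(t, x, lam):
    """Eq. (53) per unit dt, with lam = N*pi*a^2/C."""
    return math.exp(-2.0 * lam * (math.sqrt(x) - math.sqrt(t))) * lam / math.sqrt(t)

def reach_gamma_probability(x, gamma, lam):
    """Eq. (56): no elastic collision while slowing from x to gamma."""
    return math.exp(-2.0 * lam * (math.sqrt(x) - math.sqrt(gamma)))

def total_probability(x, gamma, lam, n=100000):
    """Midpoint-rule integral of Eq. (53) over gamma < t < x, plus Eq. (56)."""
    dt = (x - gamma) / n
    integral = sum(first_collision_density(gamma + (i + 0.5) * dt, x, lam)
                   for i in range(n)) * dt
    return integral + reach_gamma_probability(x, gamma, lam)

total = total_probability(x=4.0, gamma=1.0, lam=0.7)
print(f"total probability = {total:.8f}")
```

Analytically the integral of (53) is $1-\exp[-2N\pi a^2(x^{1/2}-\gamma^{1/2})/C]$, so the sum is exactly one.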

The solution of (57) is found to satisfy the inequality:

$$\frac{R_1}{2\alpha}(x-\gamma+\gamma_1)+\frac{\gamma-\gamma_1+\alpha}{2\alpha}\le g(x)\le\frac{R_2}{2\alpha}(x-\gamma)+\frac{1.15\gamma+2\alpha}{2\alpha}, \tag{58}$$

for $\gamma\le x\le20$ Mev and $Z\ge16$. $R_1$ and $R_2$ are defined by
(120) and (123) respectively and are shown graphically
on Fig. 2.

The average of these bounds will be taken as the
approximate solution for $g(x)$ for $x>\gamma$. Neglecting
terms of minor importance, we have

$$g(x)\approx\frac{R}{2\alpha}(x-\gamma)+\frac{\gamma}{2\alpha}, \tag{59}$$

with $R=(R_1+R_2)/2$.

It is clear that for $x>\gamma$ the slope of the function $g(x)$
is substantially less than for $x<\gamma$, and the curve seems
to be nearly constant for low values of $Z$.

The values of $R_1$ and $R_2$ are graphed for $Z\ge16$ which
we take as the lower limit of the range in $Z$ for energies
above $\gamma$. (See Fig. 2.)

Refinements of the arguments given in the Appendix
show that instead of being constant, $R_1$ and $R_2$ are
decreasing functions of $x$. While $g(x)$ increases, the rate
of increase is generally much less than for $x<\gamma$.

We can thus assume that $g(x)$ is constant, i.e.,

$$g(x)=g(\gamma)\approx\gamma/2\alpha. \tag{60}$$






























































FIG. 2. Variation of $R_1$ and $R_2$ with $Z$.


IV. IRRADIATION BY NEUTRONS 


We assume that the solid is irradiated by a flux of
neutrons of known spectrum and that the scattering is
isotropic in center-of-mass coordinates. Let $H(E,x)\,dx$
represent the number of primary knock-outs assumed
initially as free, with energy in $dx$ about $x$ that result
from a collision with a neutron having energy $E$. We
have

$$H(E,x)\,dx=dx/nE \quad\text{for } x<nE,$$
$$H(E,x)=0 \quad\text{for } x>nE, \tag{61}$$

where $n=4M/(M+1)^2$.
Consequently, the average number of displacements
produced in the solid by a single collision of neutrons is

$$G(E)=\int_{\alpha}^{nE}H(E,x)\,g(x-\alpha)\,dx=\frac{1}{nE}\int_{0}^{nE-\alpha}g(x)\,dx. \tag{62}$$

Substituting in (62) the simplified expressions for
$g(x)$ given by (42) and (60), we obtain

$$G(E)\approx nE/4\alpha \quad\text{for } E\le\gamma/n, \tag{63}$$

$$G(E)\approx[(nE-\alpha)^2-(1-R)(nE-\alpha-\gamma)^2]/4n\alpha E \quad\text{for } E\ge\gamma/n. \tag{64}$$
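The closed forms for $G(E)$ are straightforward to evaluate. The sketch below is an illustration only: it uses the form $(nE-\alpha)^2/4\alpha nE$ quoted in the abstract for the low-energy branch, which reduces to (63) when $nE\gg\alpha$, and the expression (64) above $\gamma/n$. The function names, the choice of returning zero below threshold, and the sample inputs (a mass number near that of copper, $\alpha=25$ ev, and a placeholder $R=0.3$) are assumptions, not values given in the paper.

```python
def energy_transfer_factor(M):
    """n = 4M/(M+1)^2 for a neutron striking an atom of mass number M."""
    return 4.0 * M / (M + 1.0) ** 2

def displacements_per_collision(E, M, alpha, gamma, R):
    """Average number of displacements per neutron collision.

    E, alpha, gamma in the same energy units (ev here).
    R is the slowly varying function of Z, supplied by the caller.
    """
    nE = energy_transfer_factor(M) * E
    if nE <= alpha:
        return 0.0   # assumption: no displacement below the binding energy
    if nE <= gamma:
        return (nE - alpha) ** 2 / (4.0 * alpha * nE)
    return ((nE - alpha) ** 2
            - (1.0 - R) * (nE - alpha - gamma) ** 2) / (4.0 * alpha * nE)

M_CU = 63.54                 # mass number used for illustration
ALPHA = 25.0                 # ev
GAMMA = 1.576e6              # ev, roughly M * 24.8 kev for copper
low = displacements_per_collision(1.0e5, M_CU, ALPHA, GAMMA, R=0.3)
high = displacements_per_collision(1.0e8, M_CU, ALPHA, GAMMA, R=0.3)
print(f"G(0.1 Mev) = {low:.1f},  G(100 Mev) = {high:.0f}")
```

Setting `R=1.0` makes the high-energy branch coincide with the low-energy form, which gives a quick consistency check on the implementation.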


We assume that the solid is irradiated by a flux
comprising $N(E)\,dE$ neutrons having energy in $dE$ about
$E$ and velocity $v_n=(2E/1.662\times10^{-24})^{1/2}$. Then the
number of displacements produced by this flux in one
cm³ per second is


j= f dpN (E)o(E)G(E)dE, (65) 
where $\sigma(E)$ is the cross section for the collision of a
neutron with the atom of the solid and the integral
extends over the whole range of the neutron spectrum.

For a neutron spectrum comprising only energies
$E\le(A+1)^2\,6.25\times10^{3}$ ev, the total amount of energy
used up to produce displacements is equal to half
of the energy absorbed by the solid (for any solid
having $Z>3$).


APPENDIX 


For the range $\alpha\le x\le\gamma$, we have from (7) the equation

$$g(x)=\int_0^x dy\,K(x,y)\,g(x-y)+\int_\alpha^x dy\,K(x,y)\,g(y-\alpha). \tag{66}$$

Equation (66) is easily seen to be of Volterra type:

$$g(x)=s(x)+\int_\alpha^x dy\,\mathcal{K}(x,y)\,g(y), \tag{67}$$

where $\mathcal{K}(x,y)$ is non-negative and bounded on the range
$\alpha\le x\le\gamma$, $\alpha\le y\le\gamma$.

It follows from the standard theory of such equations
that there is a unique solution of (66) given by the
formula:

$$g(x)=F_0(x)+\sum_{n=1}^{\infty}[F_n(x)-F_{n-1}(x)], \tag{68}$$

where $F_0$ is an arbitrary bounded integrable function
on the range $\alpha\le x\le\gamma$ and

$$F_{n+1}(x)=s(x)+\int_\alpha^x dy\,\mathcal{K}(x,y)\,F_n(y), \quad n=0,1,2,\ldots. \tag{69}$$

Our method of obtaining an approximate solution of
(66) depends upon the following theorem, which does
not seem to be standard in the literature:

Theorem I. If the kernel $\mathcal{K}$ of (67) is non-negative
and bounded and if $F_0(x)\ge F_1(x)$ on the interval
$[\alpha,\gamma]$, then for every $n\ge0$,

$$F_n(x)\ge g(x), \quad \alpha\le x\le\gamma.$$

If the inequalities are reversed, the statement remains
true.

Proof. Clearly $F_n(x)\le F_{n-1}(x)$, $\alpha\le x\le\gamma$ implies
$F_{n+1}(x)\le F_n(x)$, $\alpha\le x\le\gamma$ and the theorem follows from
(68).
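Theorem I can be illustrated on a toy Volterra equation whose solution is known. The sketch below is not taken from the paper: it uses $s(x)=1$ and a unit kernel on $[0,0.5]$, for which $g(x)=e^x$, and runs the iteration (69) with the trapezoid rule. Starting from $F_0=1$ gives $F_1=1+x\ge F_0$, so every iterate is a lower bound; starting from $F_0=2$ gives $F_1=1+2x\le F_0$ on this interval, so every iterate is an upper bound. The function name and grid sizes are arbitrary choices.

```python
import math

def picard_iterates(source, kernel, f0, a, b, n_grid=200, n_iter=6):
    """Successive approximations F_{n+1}(x) = s(x) + int_a^x K(x,y) F_n(y) dy.

    The integral is evaluated with the trapezoid rule on a uniform grid.
    Returns the grid and the list [F_0, F_1, ..., F_n_iter].
    """
    h = (b - a) / n_grid
    xs = [a + i * h for i in range(n_grid + 1)]
    current = [f0(x) for x in xs]
    iterates = [current]
    for _ in range(n_iter):
        nxt = []
        for i, x in enumerate(xs):
            integral = 0.0
            for j in range(1, i + 1):
                integral += 0.5 * h * (kernel(x, xs[j - 1]) * current[j - 1]
                                       + kernel(x, xs[j]) * current[j])
            nxt.append(source(x) + integral)
        current = nxt
        iterates.append(current)
    return xs, iterates

# Toy problem: g(x) = 1 + int_0^x g(y) dy, exact solution exp(x).
xs, lower = picard_iterates(lambda x: 1.0, lambda x, y: 1.0, lambda x: 1.0, 0.0, 0.5)
_, upper = picard_iterates(lambda x: 1.0, lambda x, y: 1.0, lambda x: 2.0, 0.0, 0.5)

for n in (0, 2, 6):
    print(f"n={n}:  lower={lower[n][-1]:.6f}  exact={math.exp(0.5):.6f}  upper={upper[n][-1]:.6f}")
```

The two sequences bracket the solution and close in on it from either side, which is the property the Appendix uses to turn a trial function $F_0$ into a rigorous bound on $g(x)$.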

Corollary I. If $F_0(x)\ge g(x)$ on the interval $[\alpha,c]$ and
if $F_0(x)\ge F_1(x)$ on the interval $[c,\gamma]$ then $F_0(x)\ge g(x)$
on the interval $[c,\gamma]$ also. If the inequalities are reversed,
the statement is also true.

Proof. For $x$ in the interval $[c,\gamma]$, we may rewrite
(67) as

$$g(x)=\left\{s(x)+\int_\alpha^c dy\,\mathcal{K}(x,y)\,g(y)\right\}+\int_c^x dy\,\mathcal{K}(x,y)\,g(y), \tag{70}$$

which is again of the same form as (67) but with the
bracketed quantity as source term. Then, for $x$ in the
interval $[c,\gamma]$,

$$\left\{s(x)+\int_\alpha^c dy\,\mathcal{K}(x,y)\,g(y)\right\}+\int_c^x dy\,\mathcal{K}(x,y)\,F_0(y)\le F_1(x)\le F_0(x), \tag{71}$$

and hence the result follows from the preceding
theorem.
For $x$ in the interval $[\alpha,2\alpha]$, Eq. (66) becomes [see
(38)]:

$$g(x)=1+\frac{1}{x}\int_\alpha^x dy\,g(y), \tag{72}$$

and it is easy to verify that the exact solution is given by

$$g(x)=1+\ln(x/\alpha). \tag{73}$$

For $x$ in the interval $[2\alpha,\beta]$, Eq. (66) becomes [see
(39)]:

$$g(x)=\frac{2\alpha}{x}+\frac{1}{x}\int_\alpha^x dy\,g(y)+\frac{1}{x}\int_\alpha^{x-\alpha}dy\,g(y). \tag{74}$$

If we take $F_0(x)=B(x+\alpha)/2\alpha$, then

$$F_1(x)-F_0(x)=\alpha(8-7B)/4x, \tag{75}$$

and this is positive or negative according as $B<8/7$
or as $B>8/7$.

To apply the corollary, we also need that for $x$ in
the interval $[\alpha,2\alpha]$,

$$B\,\frac{x+\alpha}{2\alpha}\le1+\ln\frac{x}{\alpha} \quad\text{for a lower bound}, \tag{76}$$

or that

$$B\,\frac{x+\alpha}{2\alpha}\ge1+\ln\frac{x}{\alpha} \quad\text{for an upper bound}. \tag{77}$$

It is easy to verify that $B=1$ is the largest value of
$B$ for which (76) holds, and that (77) holds if $B=8/7$.
Thus, we have, for $x$ in the interval $[\alpha,\beta]$,

$$(x+\alpha)/2\alpha\le g(x)\le4(x+\alpha)/7\alpha. \tag{78}$$


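For $x$ in the interval $[\beta,\gamma]$, Eq. (66) becomes [see
(40)]:

$$g(x)=\int_{x_1}^{x}\frac{dy\,\beta^2}{4xy^2}\,g(x-y)+\int_{[\alpha,x_1]_>}^{x}\frac{dy\,\beta^2}{4xy^2}\,g(y-\alpha), \tag{79}$$

where $x_1$ is equal to $y_{\min}$ as defined by (31).
For all elements $Z>6$ we have $\beta'>\gamma$ [see (34)] and
thus (79) becomes

$$g(x)=\int_{x_1}^{x}\frac{dy\,\beta^2}{4xy^2}\,[g(x-y)+g(y-\alpha)]. \tag{80}$$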
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An exact solution of (80) is given by 


x+a 
e(e)=3(—— ), B=any constant. (81) 
a 


Taking $B=1$ and applying Corollary I, we have

\[ (x+\alpha)/2\alpha \le g(x), \qquad Z>6,\ \beta \le x \le \gamma; \quad Z \le 6,\ \beta \le x \le \beta'. \tag{82} \]


For $Z \le 6$, $\beta' \le x \le \gamma$, the equation determining g(x)
becomes


g(x)= f? ae ye f° ato) (83) 


but we may continue to write (83) as 
z dy? 
=f Lee) +e-0)]} 
t1 Axy? 


if we agree to define g(x) =0 for x<0. 

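To obtain a lower bound for g(x), define

\[
\begin{aligned}
f(x) &= 0, & x &< 0; \\
f(x) &= 1, & 0 \le x &\le \alpha; \\
f(x) &= (x+\alpha)/2\alpha, & \alpha \le x &\le \beta'-\alpha; \\
f(x) &= \frac{x+\alpha}{2\alpha} - \frac{k}{2\alpha}(x-\beta'+\alpha), & \beta'-\alpha < x &\le \gamma;
\end{aligned}
\tag{85}
\]

where k = constant.

To determine $k \ge 0$ so that f(x) will be a lower
bound, we note that $f(x) \le g(x)$ for $x<\beta'$, and so f(x)
will be a lower bound also for $\beta' \le x \le \gamma$ provided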


* dyB* 
fz Sale s)+ 1-8) f(a)<0. (86) 


1 xy? 


Substituting for f(x) and integrating, (86) becomes 
= dtr x-ytay pedysy  y 
(Jha 
a Ay? 2a Avy. 2a 


“dy? y k dy? 
-f =~ f e-y-6'+0) 
ay 4xy? 2a Jad 2 4xy* 


(87) 


k 
sv: nm ’ < —p’+a)=0. 
Sz ss (y—6’) “( B'+a)2 


The first integral of (87) is small compared to the 
second and will be replaced by 
). (88) 


Ba of ef a eee 
ie Jo Saw? a 407" 2a 


‘16%? 4x 
Substituting from (88) for the first integral and 
integrating in (87), the condition for a lower bound 
becomes 


Ba Ori 1 
at 
2a 2a xy 


a) leo. (89) 


k a 
+ 1 ata 

2a % Hy 
Multiplying by 8ax/6?, Eq. (89) becomes 


B'(x—B “le 


XX1 


1 tnt 1“ (90) 


2x? 2a 


An elementary argument shows that 
1—a/x+In[@’ (x—B’+a)/xx; | 
is an increasing function of x and is positive for «>. 
Dividing (90) by 1—a/x+In[6’(x—8’+a)/xx1], the 
condition becomes 


m 1+ (a@/x?)+1n (%1/2a) 
“1=(e/x)-+In[6" (x—6’-+a)/x01] 





=0. 


Since (91) is a decreasing function of x, (91) will be
true for $\beta' \le x \le \gamma$ provided


k= max 0 


For ZS$6, Eq. (92) gives the following values for k: 


—1—(e?/2y’)—In(71/2a) (02) 
1— (0/7) +In[e’ (y—B’+0)/yyr 

















xta k 
fe 8'+0)= fete) 


2a 


(93) 


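for $Z \le 6$, $\beta' \le x \le \gamma$.
Since $A(x+\alpha)/2\alpha \le f(x)$, $\alpha \le x \le \gamma$, where
$A = 1 - [k(\gamma-\beta'+\alpha)/(\gamma-\alpha)]$, we have

\[ A(x+\alpha)/2\alpha \le g(x), \qquad 0 \le x \le \gamma, \tag{94} \]

with $A=1$ for $Z>6$; and for $Z \le 6$, A is given by the
accompanying table.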








Z 4 3 
A 0.8770 0.7542 
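Now consider a function h(x) defined by

\[
\begin{aligned}
h(x) &= 0, & x &< 0; \\
h(x) &= 1, & 0 \le x &\le \alpha; \\
h(x) &= 4(x+\alpha)/7\alpha, & \alpha < x &\le \beta-\alpha; \\
h(x) &= \Big(\frac{8}{7}+k\Big)\frac{x+\alpha}{2\alpha}, & \beta-\alpha &< x;
\end{aligned}
\tag{95}
\]

where k = constant.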

Let 6* be the energy x, where x:=x—6+a. It is 
easily seen that B<6*<f” (see 35). Then for BS«<f* 
we have x1;—a>a, and so 


. x+a 3a—4a x ‘ i x 
wii (= ) +f "Sige x—a 2a “| 
(96) 


Hence, the condition h;(x)<h(x) is equivalent to 


3a—4x «4 x 1 x 2x(x+a) 
+— In——+h} — in ————~|s0. 
7x(x—a) Ta x—a 2a 6 af? 


Since $\ln[x/(x-\alpha)] < [\alpha/(x-\alpha)]$, Eq. (97) will be
satisfied if








3a 1 xs 2 
+4 — in-—— 150, As25/*. 
7x(x—«a) 2a B of? 





(98) 


For $k>0$, the left member of (98) is a decreasing
function of x, and hence (98) is satisfied for $\beta \le x < \beta^*$ if





30? 
k=, (99) 
148 (8—a) 
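Since $\beta/\alpha = 39.71$, we have

\[ k = 1.377\times10^{-4} \tag{100} \]

as an admissible value of k, and thus

\[ \Big(\frac{8}{7} + 1.377\times10^{-4}\Big)\frac{x+\alpha}{2\alpha} \ge g(x), \qquad \beta \le x \le \beta^*. \tag{101} \]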
For $\beta^* < x \le \beta''$, we only need to add the term 
k z—Bt+a dyB* 
-~ —(x—y-+a) 
2a 71 4xy? 


to (97), so that on this range f(x) <h(x) is equivalent 
to 


3a—4x 4 x 
+— In 
7x(x—-a) 7a x-—a 








k Ly a B 
+n re |s°. (102) 
2al B(x—B+a) x x—Bt+a 
Since $x_1 < \beta^2/4x$, $\ln[x/(x-\alpha)] < [\alpha/(x-\alpha)]$, Eq.
(102) will be true if
B B—a 
4(x—B+a) x—B+a 


The function in square brackets in (103) has a maximum
at $x=2(\beta-\alpha)$, and this maximum is

\[ \ln[\beta/4(\beta-\alpha)] - 1 < \alpha/(\beta-\alpha) - 2.386. \]

Thus (103) is satisfied if we take


3a? 602 
k=2.7X10*>—> (104) 

78?” 78(8—a)[2.386—a/(8—a)] 
Thus 





+4{n |s°. (103) 
7x(x—a) 





\[ \Big(\frac{8}{7} + 2.7\times10^{-4}\Big)\frac{x+\alpha}{2\alpha} \ge g(x), \qquad \alpha \le x \le \beta''. \tag{105} \]

For $\beta'' < x \le \gamma$ we have to add the term


‘. 2a dy 1 4 
oh hf Sob ts 
7a 2a Ta 


to (102). Thus, on the range $\beta'' \le x \le \gamma$, $h_1(x) \le h(x)$ is
true if


3a—4x% Pe 4 
ions 2a . 


k XX a B 
+-[in — |so. (106) 
2al B(x—B+a) x x—B+a 


Making the same substitutions as before, we get 














8 B— 
+1-- ina+ 4 In ~ Js 
7 4(x—Bt+a) x—Bta 


(107) 
Since $x \ge \beta'' > 2(\beta-\alpha)$, (107) is a decreasing function
of x for $x>\beta''$. Putting $x=\beta''$ and using $\beta/\alpha = 39.71$,
we get as an admissible value for k:


7x(x—a 


. 602 
1—- In2+--——_ 
7 78"'(8"'—a) 
B-—a 4(6”—B+a) 
+n: 


B"—B+a B 


k=0.068> 





(108) 





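Thus

\[ \Big(\frac{8}{7} + 0.07\Big)\frac{x+\alpha}{2\alpha} \ge g(x), \qquad \alpha \le x \le \gamma,\ Z>3. \tag{109a} \]

For $Z>7$, $\beta''$ exceeds $\gamma$ and so

\[ \Big(\frac{8}{7} + 2.7\times10^{-4}\Big)\frac{x+\alpha}{2\alpha} \ge g(x), \qquad \alpha \le x \le \gamma,\ Z>7. \tag{109b} \]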













DISORDERING OF SOLIDS 


In the energy region $x>\gamma$, the equation to be solved 
is given by [see (57); we have substituted for 


(Nxa?/C)y*): 


g(a) =8) ex|-2(*) -1}4 f * (°) 
f-()-()] 


* dy8 
x J Get tK0-a)} (10) 


+ Qhe(-M2) 


where (6’):=a, exceeds 20 Mev if Z= 16 and [a,t; |,=h 
if we require Z2 16 and x<f’. 

It is easy to see that (110) may be written in the
form (67) and is also an integral equation of Volterra
type. A function f which is an upper (lower) bound for
g in the region $x \le \gamma$ and which satisfies


g(v) exp{—2d[ (x/7)!—1]} 


The energy 


= din 
+} ee Ce 


xf StH-9+10- a} 
— f(x) S0(20) (111) 


ty 


will give an upper (lower) bound for g in the region
$\gamma \le x$.

Now (111) is certainly true for $x=\gamma$ if $f(\gamma) \ge g(\gamma)$
[$f(\gamma) \le g(\gamma)$], and thus f will be an upper (lower) bound
for $x \ge \gamma$ if f is an upper (lower) bound for $x \le \gamma$ and if
the derivative of (111) is negative (positive) for all
$x>\gamma$.

The same is true if we multiply (111) by

\[ \exp\{2\lambda[(x/\gamma)^{1/2} - 1]\}. \]

Thus we obtain the conditions $f(x) \ge g(x)$, [$f(x) \le g(x)$]
for $x \le \gamma$ and


f A e-04 50-01 
“1 4uy 


—f(x)- <0(20) (112) 


for an upper (lower) bound. 
Let 


f' (x) (wy)# 
r 


“+ 
f(a)=A——+B, ~n: 


R 
f(2)=—-(x-)+D, x27-N; 
2a 
where A, B, R, and D are constants. Substituting in 
(112), the left side becomes 

















scalar (w—y+71) (y—1-+a) 
2aB— “— (A—R) In 
XX 
UA Pa erhrn D)2a—(A—R)y1 
“i s—ytn 
(D—B)2a—R(y+a) 
+ (113) 
¥~*r+e 
Taking 
Ry y—-yite 
B=0, A=1, D=—-+ ey 
2a 2a 


f(x)Sg(x) for xy, 


and substituting these values in (113), the condition 
for a lower bound is that for «27, 





ng? 
=1+(1-R) 1+ 
Aytel 


a = (%—y+71)(y—-y1+e) 
1—-+l1n 
x XX 





x | Jeo. (114) 





Taking A=1.15, B=0.425, D=1+(1.15y/2a), and 
substituting in (113), the condition for an upper bound 
is that for x2y, 





























0.85aX re? v1 
= 115+ (1.15-R)| 1+ {1—"+ 
(xy)? 4yixtl ox y—vita 
v1 («#—y+71) (y-1+e) 
_ +n |] so (115) 
2—9t71 x2} 
Since 
5 1414 aea 
—>—=—+—>~—, -<.-, 
Fm Ff Pe 7 
Eq. (114) will be satisfied for x2 if 
2 
~14+(1-R)| 1+ 
4yix! 
a  4(x—y+7)(y—v1+a) 
x |1—“in ||z0; (116) 
x 
and (115) will be satisfied for a2y if 
85aX Ae? v1 
—1.15+(1.15— -R|1+— 1+ - 
Y 4yixtl = =-y—y +a 
Y S(a—-y+71) (y—-y1te 
i 1 wi 1) ( 1 ‘\]z0. (117) 
x—Y+"1 2 
Now x occurs in (116) and (117) only in the form 





Tin: 


oi t(*—-y+71) are 
%—Y+71 Be 


H(x)= ro 
(118) 


with a nearly 1, d either 0 or yi, and r=4 or 5. 
The derivative of (118) is 


Sb+y— 71, 2b(y¥—71) 
7 
x—yty1 (x—y+1)? 


t(x—y+71) (y¥—-71+ a) 
—31n : 
B 


It is easily seen that the bracketed quantity in (119)
is a decreasing function of x which is positive at $x=\gamma$
and negative at $x=\infty$, and thus (119) changes sign
exactly once for $x>\gamma$.

It follows that both (116) and (117) are concave
downward for $x \ge \gamma$ and hence attain a maximum at
some value $x_{\max}>\gamma$ while the minimum is attained
at one end of the range of x.

Thus for a lower bound on the range $\gamma \le x \le \delta$ we may
take $R=R_1$ where





20-8] 2— 30+ 





| (119) 


1 
"Qa 1+ (62/474) min[H (y),H(6) 11 


where H is defined by (118).

It is not difficult to show that (119) is negative for
$x=4\gamma/3$, and thus the value $x_{\max}$ where H attains its
maximum satisfies $x_{\max}<4\gamma/3$. Since


b 7(#—y+1)(y—11+a) 
+1n 


t—Y+71 6° 





(120) 





a— 
is an increasing function of x, we have 


Ti 





7 
H (%max)<T= r| + 
y+3m. y—-v1+ea 


Ms (+311) (y¥—-v1+a) 

36? 
and hence for an upper bound, we take $R=R_2$ where 
1.15— (0.85ad/7) 
1+ (8%/4y/)T 


Neglecting the term $(0.85\alpha\lambda/\gamma)$, which is very small,
we may take





| (121) 





1.15—R2= (122) 


$15 


Pipe emer. (123) 
1+ (4y3/\8°7) 


Thus 


Ri oe 
—(*x—y+71)+———S g(x) 
2a 2 


a 


Ro 1.15y 
s—(s—7)+1+ ’ 
2a 2a 


(124) 


where $R_1$ is the value of R defined by (121) and $R_2$ the
value of R defined by (123).

The values of $R_1$ and $R_2$ are plotted in Fig. 2, with
the value $\delta=800\gamma/\lambda$ which corresponds to an energy
of 20 Mev.

The authors wish to express their thanks to Dr. S. 
Campbell for a critical reading of the manuscript and 
for supervising the numerical calculations which were 
done by Miss Mary Todd. Both Dr. Campbell and 
Miss Todd are members of the Oak Ridge National 
Laboratory Mathematics Panel. 
The Drude-Zener theory of optical absorption by free carriers is applied to the infrared absorption of
n-type germanium and p-type silicon. Average effective masses so determined are: for electrons in germanium
⟨m*⟩av/m ranges from 0.11 to 0.22; for holes in silicon ⟨m*⟩av/m ranges from 0.19 to 0.55. The average
effective mass values of electrons in germanium are in good agreement with those measured by cyclotron
resonance. The infrared absorption bands of p-type germanium are explained on the basis of transitions of
holes between three energy bands lying near the top of the valence band. This band structure is suggested by
cyclotron resonance experiments. Application of the theory to p-type silicon leads to the prediction of an
absorption peak near 25μ and two lesser ones near 33μ.





1. INTRODUCTION


RECENT experiments on infrared absorption in
germanium and silicon have shown that there is a 
marked dependence of the absorption on the type and 
concentration of the carriers present. In n-type ger- 
manium! and in samples of silicon** indicated to be 
p-type, the absorption follows the Drude-Zener fre- 
quency dependence, but the magnitudes of the absorp- 
tion coefficients calculated using the free electron mass 
are too small by factors between 10? and 10%. We 
attribute this to effective masses lighter than that of the 
free electron. In p-type germanium?*~’ absorption 
bands are observed. It has been suggested that this 
absorption is caused by interband transitions of holes.® 
In Sec. 2, we estimate from the infrared data the 
effective masses of carriers in germanium and silicon 
which follow the Drude-Zener behavior. In Sec. 3, we 
consider the optical absorption of p-type germanium by 
interband transitions using a simple model suggested by 
the results of cyclotron resonance experiments.*—"! 


2. OPTICAL ABSORPTION BY FREE CARRIERS 


The Drude-Zener theory of free carrier absorption¹²
predicts an absorption coefficient

\[ K = \frac{4\pi}{nc}\,\frac{N e^2 \tau}{m^*(1+\omega^2\tau^2)}, \tag{1} \]

where ω=2πν is the angular frequency of the radiation,
n the refractive index of the medium, m* the effective
mass of the carriers of concentration N, and τ the mean
relaxation time of the carriers. This is developed on the
assumption of a spherically symmetric, nondegenerate
energy surface. In the case of germanium the lowest
conduction band has minima along the ⟨111⟩ directions
of k space, near which the energy surfaces are prolate
spheroidal." The longitudinal effective mass is denoted
by m_l and the transverse by m_t. If the relaxation time is
isotropic, Eq. (1) is still valid if we replace m* by
⟨m*⟩av where

\[ \frac{1}{\langle m^*\rangle_{\rm av}} = \frac{1}{3}\Big(\frac{2}{m_t} + \frac{1}{m_l}\Big). \]

The mobility μ=eτ/⟨m*⟩av is introduced. In the region
where ω²τ²≫1 the absorption coefficient can then be
written

\[ K = N e^3/(n c \pi \langle m^*\rangle_{\rm av}^2 \nu^2 \mu). \tag{2} \]
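Equations (1) and (2) can be sketched in code to show how an effective mass is extracted from a measured absorption coefficient. The sketch below works in Gaussian units; the function names and every sample number (carrier concentration, refractive index, relaxation time, frequency) are illustrative assumptions and are not taken from the paper's data.

```python
import math

E_CHARGE = 4.803e-10   # electron charge, esu
M_E = 9.109e-28        # free electron mass, g
C_LIGHT = 2.998e10     # speed of light, cm/s

def drude_zener_K(N, n_refr, m_eff, omega, tau):
    """Eq. (1): K = 4*pi*N*e^2*tau / (n*c*m*(1 + omega^2*tau^2))."""
    return (4.0 * math.pi * N * E_CHARGE**2 * tau
            / (n_refr * C_LIGHT * m_eff * (1.0 + (omega * tau) ** 2)))

def high_frequency_K(N, n_refr, m_eff, nu, mu):
    """Eq. (2): K = N*e^3 / (pi*n*c*m^2*nu^2*mu), valid when omega*tau >> 1."""
    return N * E_CHARGE**3 / (math.pi * n_refr * C_LIGHT * m_eff**2 * nu**2 * mu)

def mass_from_K(K, N, n_refr, nu, mu):
    """Invert Eq. (2) for the effective mass, given a measured K."""
    return math.sqrt(N * E_CHARGE**3 / (math.pi * n_refr * C_LIGHT * nu**2 * mu * K))

# Illustrative inputs only (assumed, not measured values).
N = 1.0e17                 # carriers per cm^3
n_refr = 4.0               # refractive index
m_eff = 0.12 * M_E         # effective mass
tau = 2.0e-13              # relaxation time, s
nu = 3.0e13                # frequency, Hz (10-micron wavelength)
mu = E_CHARGE * tau / m_eff          # mobility in Gaussian units
K_full = drude_zener_K(N, n_refr, m_eff, 2.0 * math.pi * nu, tau)
K_limit = high_frequency_K(N, n_refr, m_eff, nu, mu)
m_back = mass_from_K(K_limit, N, n_refr, nu, mu)
```

Because K in Eq. (2) varies as the inverse square of the mass, a mass several times lighter than the free electron mass raises the predicted absorption by one to two orders of magnitude.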


Figure 1 shows typical absorption coefficients of
n-type germanium samples in the infrared, as determined
by Fan and Becker.¹ The sharp rise at about 1.8μ
is associated with electronic transitions from the highest
valence band to the lowest conduction band. At wavelengths
longer than 6μ, the absorption is found to vary
directly with 1/ν².

FIG. 1. Absorption coefficient of samples of n-type germanium as
observed by Fan and Becker¹ (log-log plot).

The data of Conwell¹³ were used for the determination
of N and μ in Eq. (2). By using Fig. 7 of reference 13,
which gives conductivity vs temperature for specimens
of known carrier concentration, it was possible to estimate
N when the conductivities of the samples were
given. Figure 5 of reference 13 gives mobility as a
function of temperature for the various carrier concentrations.
With these values, we find values of ⟨m*⟩av/m
by fitting Eq. (2) to the observed absorption curves.
The values so determined range from 0.11 to 0.22, as
shown in Table I.

The values of m_l* and m_t* for n-type germanium in
cyclotron resonance experiments!" near 4°K are 1.3m
and 0.08m respectively. This predicts for ⟨m*⟩av/m the
value 0.12, which is close to the values given in Table I.
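The quoted value 0.12 follows from the averaging rule for spheroidal energy surfaces, in which the inverse masses are averaged with weight 2 for the transverse and 1 for the longitudinal direction. A minimal sketch, with a hypothetical function name and masses expressed in units of the free electron mass:

```python
def spheroid_average_mass(m_long, m_trans):
    """Average mass from 1/<m*> = (1/3) * (2/m_t + 1/m_l)."""
    return 3.0 / (2.0 / m_trans + 1.0 / m_long)

# Cyclotron resonance masses quoted in the text, in units of m.
m_av_ge = spheroid_average_mass(m_long=1.3, m_trans=0.08)
print(round(m_av_ge, 2))  # 0.12
```

The light transverse mass dominates the average, which is why the result sits close to 0.08m scaled by 3/2 rather than near the heavy longitudinal mass.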

The infrared absorption of p-type silicon has the same
characteristics as that of n-type germanium. On the
long wavelength side of the energy gap transition, the
absorption appears to be that of the Drude-Zener type.
Equation (2) is fitted to the experimental curves and
values of ⟨m*⟩av obtained. Mobilities and concentrations
for p-type silicon were obtained from the paper of
Pearson and Bardeen.¹⁴ The values of ⟨m*⟩av/m so
obtained are listed in Table II with the sources of the
data. Values range from 0.19 to 0.55.

Tolpygo¹⁵ finds the free carrier absorption by solving
the Boltzmann equation for the electronic velocity

FIG. 2. Absorption coefficient of a sample of p-type germanium
of resistivity 0.07 ohm-cm, after Kaiser, Collins, and Fan⁷ (log-log
plot).


13 E. M. Conwell, Proc. Inst. Radio Engrs. 40, 1327 (1952).
14 G. L. Pearson and J. Bardeen, Phys. Rev. 75, 865 (1949).
15 K. B. Tolpygo, Zhur. Eksptl. i Teort. Fiz. 22, 378 (1952).
TABLE I. Average effective masses of electrons in germanium as
determined by Eq. (2).

Samples of Fan and Becker(a)

Temperature (°K)   Resistivity (Ω-cm)   Calculated values of ⟨m*⟩av/m
300                0.005                0.11
                   0.02                 0.12
                   0.1                  0.20
                   5.0                  0.14

Sample of Collins and Fan.(b) N=2.8X10" electrons/cm³ at all
temperatures given.

Temperature (°K)   Calculated values of ⟨m*⟩av/m
77
300
353
379
439

(a) See reference 1.
(b) See reference 2.

distribution with radiation present. The current is
calculated with the use of a relaxation time inversely
proportional to the velocity. He obtains expressions for
the optical constants and interprets the results of Briggs
to give a value of ⟨m*⟩av/m=0.25.

Cyclotron resonance experiments on p-type silicon¹⁶
indicate that there are two bands at the top of the
valence band, degenerate at k=0 and having approximate
effective masses of 0.17 and 0.50. If both bands
have the same relaxation time the cyclotron resonance
values would predict a value of ⟨m*⟩av/m=0.38, which is
within our spread of values. One should expect such a
band structure to have an absorption band, as shown in
Sec. 3. It is there shown that the peak of this band is
expected to occur at 25μ, which is outside the range of
experimental values quoted by Becker and Fan³ and
Briggs.⁴
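The value 0.38 can be reproduced under one plausible reading of the averaging, stated here as an assumption rather than as the authors' documented procedure: each band contributes to the conductivity in proportion to its hole population divided by its mass, and the populations of two parabolic bands degenerate at k=0 scale as the 3/2 power of the mass. The function name is hypothetical.

```python
def two_band_average_mass(m1, m2):
    """Population-weighted average of 1/m for two parabolic bands.

    Assumes equal relaxation times and populations N_i proportional to
    m_i**1.5, so that 1/<m*> = (m1**0.5 + m2**0.5) / (m1**1.5 + m2**1.5).
    """
    return (m1**1.5 + m2**1.5) / (m1**0.5 + m2**0.5)

# Silicon hole masses quoted in the text, in units of m.
m_av_si = two_band_average_mass(0.50, 0.17)
print(round(m_av_si, 2))  # 0.38
```

The heavy band holds most of the holes, so the average lies much closer to 0.50m than a simple mean of the inverse masses would suggest.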


3. OPTICAL ABSORPTION BY INTERBAND 
TRANSITIONS 

The problem of electronic transitions from one energy
band to another induced by electromagnetic radiation
has been treated by Wilson.¹⁷ He finds the average of the
current density in the presence of radiation, to the first
order in the applied fields. We adapt his results to obtain
the conductivity caused by transitions from band i to
band j. Wilson's Eq. (194) gives for the conductivity
the expression


weh 


o=— 


mu 


5(wis—w) [viewed N.(k)dk. (3) 


In the above, ω is the angular frequency of the applied
radiation which has polarization vector e, ω_ij=[E_i(k)


16 Dexter, Lax, Kip, and Dresselhaus, Phys. Rev. 96, 222 (1954). 
17 A. H. Wilson, Theory of Metals (University Press, Cambridge, 
1936), first edition, pp. 126-131, especially Eq. (194).
INFRARED ABSORPTION OF CARRIERS IN Ge AND Si


−E_j(k)]/ħ is the frequency difference between bands i
and j, the ψ's are the respective Bloch functions, and
N_i(k)dk is the number of electrons of band i per unit
volume lying in volume dk of k-space. The transitions
are vertical, that is, k is not changed in the transition.
The frequency difference ω_ij is to be looked upon as a
function of k. One integration may be performed by
setting dk=dS dω_ij/|∇_k ω_ij|, where dS is an element of
surface of constant ω_ij. The conductivity at frequency ω
becomes
if focen 2N; af) - (4) 


| Views 

The integral is to be evaluated at ω_ij=ω. The absorption
coefficient is then obtained through the relation
K_ij=4πσ_ij/nc.

In p-type germanium the infrared absorption spectrum
consists of bands. Typical curves are shown in
Fig. 2. At 300°K, there are peaks near 500 cm⁻¹, 2200
cm⁻¹, and 3500 cm⁻¹. As the temperature is lowered the
band in the neighborhood of 2200 cm⁻¹ diminishes and
has disappeared at 77°K; the band at 3500 cm⁻¹
becomes sharper and higher with the peak moving
toward longer wavelengths; the long wavelength band
beginning at about 1800 cm⁻¹ becomes narrower and
higher. In addition the strong absorption near 1.8μ is
observed as in n-type germanium. This is due to the
energy gap jump as before. It will be shown that the
model for p-type germanium proposed by Dresselhaus,
Kip, and Kittel⁹ will predict infrared absorption of the
observed character and of the correct order of magnitude.

According to this model, the top edge of the valence
band consists of two bands, almost spherically symmetric,
with effective masses m_1*=0.3m and m_2*=0.04m.
These bands touch at k=0 at a point of
symmetry of type Γ₈⁺ in the notation of Elliott.¹⁸ Each
of these bands is twofold degenerate if we consider spin.
The four wave functions at k=0 have the symmetry




TABLE II. Average effective masses of holes in silicon as
determined by Eq. (2).

Samples of Fan and Becker,(a) T=300°K

Resistivity (ohm-cm)   ⟨m*⟩av/m
0.014                  0.55
0.032                  0.30
0.075                  0.19
0.5                    0.31

Samples of Briggs,(b) T=300°K

Impurity content (percent B)   Resistivity (ohm-cm)   ⟨m*⟩av/m
0.0005                         0.03
0.001                          0.012
0.002                          0.007
0.003                          0.005
0.005                          0.004
0.01                           0.0015

(a) See reference 1.
(b) See reference 4.


18 R. J. Elliott, Phys. Rev. 96, 266, 280 (1954).
FIG. 3. Proposed structure of the uppermost valence bands of
germanium in the neighborhood of k=0. The Bohr unit of length
is a₀=0.529 A.

properties of atomic p_{3/2} orbitals arranged with opposite
signs on the two fcc sublattices of the diamond lattice.
At an energy ΔE lower is another band, spherically
symmetric and doubly degenerate. The wave functions
at k=0 are of type Γ₇⁺, having the symmetry of p_{1/2}
orbitals with opposite signs on the two sublattices. The
separation ΔE is caused by spin-orbit interaction. This
structure is illustrated in Fig. 3. Optical transitions
between these bands are forbidden at k=0. Away from
k=0, this selection rule breaks down and electric dipole
transitions can occur. From such a model, it is easy to
understand the origin of absorption bands. Near k=0,
few transitions can occur because of the selection rule.
Farther from k=0, absorption can take place if irradiation
of the proper frequency is present. Still farther
from k=0, the Maxwellian distribution function for the
holes will become small and few holes will be present to
make transitions. Thus one should expect to find
absorption bands with widths strongly dependent on
temperature.

We now calculate the absorption coefficients K;; 
according to Eq. (4) with some simplifying assumptions 
concerning the band structure and matrix elements. 
First we treat transitions from band 1 to band 2. 

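An expression for the matrix element will be obtained.
The one-electron Schrödinger equation for the stationary
states with spin-orbit interaction included is

\[ -\frac{\hbar^2}{2m}\nabla^2\psi + V(\mathbf{r})\psi + \frac{\hbar}{2m^2c^2}\,\mathbf{S}\cdot[\nabla V(\mathbf{r})\times\mathbf{p}]\,\psi = E\psi. \]

Since ∇V has the same periodicity as V, the Bloch form
ψ=u_k e^{ik·r} is preserved. The equation for the u function
is

\[ \Big[-\frac{\hbar^2}{2m}\nabla^2 + V + \frac{\hbar}{2m^2c^2}\,\mathbf{S}\cdot(\nabla V\times\mathbf{p})\Big]u_k + \hbar\mathbf{k}\cdot\Big[\frac{\mathbf{p}}{m} + \frac{\hbar}{2m^2c^2}(\mathbf{S}\times\nabla V)\Big]u_k = \Big[E(\mathbf{k}) - \frac{\hbar^2k^2}{2m}\Big]u_k. \tag{5} \]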

m 2m m 
We are interested in points near k=0 and accordingly 
treat the left-hand term of Eq. (5) enclosed in square 
brackets as a perturbation. The second term in the 
square brackets is of order v/c while the first is of order 1 
in the nonrelativistic approximation. Hence, we use only 
the first term. In any event, the two terms have the 
same symmetry and will admix the same wave functions. 

The Schrödinger equation will be considered solved
for k=0. We proceed away from k=0 by applying the
perturbation (ħ/m)k·p. Let |λ_i⁰⟩ be a u function at
k=0 belonging to energy E_λ⁰. A Latin subscript will
indicate the degenerate states of level λ. The first order
u function is

\[ |\lambda_i^{k}\rangle = |\lambda_i^{0}\rangle + \frac{\hbar}{m}\sum_{\mu_j}\frac{\langle\mu_j^{0}|\mathbf{k}\cdot\mathbf{p}|\lambda_i^{0}\rangle}{E_\lambda^{0}-E_\mu^{0}}\,|\mu_j^{0}\rangle. \tag{6} \]

To apply this standard perturbation theory, the |λ_i⁰⟩
functions must be the correct linear combinations: in
this case those which diagonalize the energy to second
order in k. This has been treated by Shockley.¹⁹ The
condition is

\[ \frac{\hbar^2}{m^2}\sum_{\mu_m}\frac{\langle\lambda_i^{0}|\mathbf{k}\cdot\mathbf{p}|\mu_m^{0}\rangle\langle\mu_m^{0}|\mathbf{k}\cdot\mathbf{p}|\lambda_j^{0}\rangle}{E_\lambda^{0}-E_\mu^{0}} = W_i(\mathbf{k})\,\delta_{ij}. \tag{7} \]

W_i(k) is the second-order change of energy produced by
the operator (ħ/m)k·p. Dresselhaus et al.⁹ and Elliott¹⁸
have found W(k) for diamond-type crystals in terms of
a few fundamental parameters. They obtain for holes:


h? A+2B A-B 2 
E(k)=| —+ Je+|(—) ke 
2m 3 3 

C?—(A—B)? 
i cwvitertsipneinioael 


4 
(babtege+behe] (8) 
FIG. 4. Calculated optical absorption cross section of holes in
germanium at 300°K (log-log plot).
where the constants A, B, and C are determined by
matrix elements and energy differences between the 
initial band and bands of other symmetry. These same 
constants appear in the expression for the matrix 
element for interband transitions. The matrix element 
for transitions between bands degenerate at k=0 is 
obtained from Eq. (6). It is given by 


Ark] e-p|As*)=(A| &-p]A.°) 
h _ frd|e-p|u?Xus?|k-p|a.°) 
For diamond-type lattices, the term of zero order in k of
Eq. (9) will vanish. This follows from the presence of
inversion symmetry which causes all wave functions
belonging to a degenerate level at k=0 to have the same
parity.

One can obtain a selection rule for the direction of 
polarization e. If e is parallel to k, the terms in Eq. (9) 
become proportional to the sum in Eq. (7). Thus the 
transition element between different bands will vanish. 

We make the approximation that bands 1 and 2 are
spherically symmetric:

\[ E_1 = \hbar^2k^2/(2m_1^*), \qquad E_2 = \hbar^2k^2/(2m_2^*). \tag{10} \]

For the case of germanium, this assumption will correspond
to a spread of absorption frequencies of 10 percent.
With this assumption, it is then possible to carry
out the integration of Eq. (4). For N_1(k) we use the

Maxwellian distribution function, 


he 


kT) 


where N_1 is the total number of holes in band 1. In
evaluating Eq. (4) we need |∇_k ω_12| which is given by
ħk(m_1*−m_2*)/m_1*m_2*. All of these factors are introduced
and the integration carried out. The matrix
element is proportional to k and the square of its
absolute value averaged over all directions is written
ħ²k²A_12², where A_12 is a dimensionless parameter. The
absorption coefficient is then found to be
In this expression, the induced emission from band 2 to
band 1 has been omitted. This is calculated by similar
application of Eq. (4) and subtracted from the above
absorption. N_1 and N_2 are given in terms of N, the total
number of carriers, by the relations:

\[ N_1 = \frac{N}{1+(m_2^*/m_1^*)^{3/2}}, \qquad N_2 = \frac{N\,(m_2^*/m_1^*)^{3/2}}{1+(m_2^*/m_1^*)^{3/2}}. \]
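The population relations can be evaluated directly for the germanium masses used later in the text. This is a small sketch with a hypothetical function name; it assumes only that the band populations scale as the 3/2 power of the effective mass, as the relations above state.

```python
def band_fractions(m1, m2):
    """Fractions N1/N and N2/N for two parabolic bands degenerate at k = 0."""
    ratio = (m2 / m1) ** 1.5
    n1 = 1.0 / (1.0 + ratio)
    n2 = ratio / (1.0 + ratio)
    return n1, n2

# Germanium hole masses quoted in the text, in units of m.
frac_heavy, frac_light = band_fractions(0.3, 0.04)
```

With these masses roughly 95 percent of the holes sit in the heavy band 1, so absorption that starts from band 2 is weighted by a population about twenty times smaller.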


The absorption coefficients for all transitions between 
bands 1, 2, and 3 are given by 
In the above, ν̄=ν/c is the wave number of the incident
light. The energy gap at k=0 between bands 1 and 3 is
ΔE=hcν̄₀. The restrictions given in the parentheses
indicate the ranges of ν̄ to which the formulas apply.
They follow from the energy separations between bands.
Going away from k=0, ν_12 increases from zero; ν_13
increases, starting at ΔE/h; ν_23 decreases, starting at
ΔE/h. No induced emission occurs from band 3 as we
shall set ΔE at a value much greater than kT. Equations
(12), (13), and (14) are evaluated using m_1*=0.3m,
m_2*=0.04m, as determined by cyclotron resonance. It is
not possible to get an exact fit to the bands found
experimentally, but a qualitative agreement is obtained
if we take ΔE=2400 cm⁻¹=0.3 ev and m_3*=0.1m. If
the spin-orbit splitting, ΔE, is small compared to the
energy gap between valence and conduction bands, then

\[ m_3^* = \Big[\frac{1}{2}\Big(\frac{1}{m_1^*}+\frac{1}{m_2^*}\Big)\Big]^{-1}. \]

Estimating the correction due to ΔE, we get m_3*≈0.1m.
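The two numbers in this paragraph can be checked with a short calculation. The sketch below evaluates the uncorrected split-off mass from the averaging rule and converts the chosen gap from wave numbers to electron volts; the function name is hypothetical and the conversion constant is the standard value of hc.

```python
EV_PER_WAVENUMBER = 1.239842e-4   # h*c in eV*cm

def split_off_mass(m1, m2):
    """Uncorrected m3* from 1/m3* = (1/2) * (1/m1* + 1/m2*)."""
    return 1.0 / (0.5 * (1.0 / m1 + 1.0 / m2))

m3_uncorrected = split_off_mass(0.3, 0.04)      # in units of m
gap_ev = 2400.0 * EV_PER_WAVENUMBER             # gap of 2400 cm^-1 in eV
```

The averaging rule alone gives about 0.07m, so the quoted 0.1m reflects the additional correction for the finite splitting that the text mentions.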


The A constants of Eqs. (12) to (14) are chosen to fit 
the maxima at T=300°K. The values chosen are 


Ay.=5.2, A13=2.99, Ag = 90. 


An estimate of the matrix element for transition 1 to 2 
based on the form of the wave functions at k=0 is 


estimated approximately by 


(15) 


—Cm 
Ke pleat sind, 


where C is the same constant as in Eq. (8). Lax et al. find
by cyclotron resonance C=−32 in units of ħ²/2m. From
these figures we obtain A_12=6.4, which is in excellent
agreement with the value obtained by interpretation of
the optical absorption.

Plots of the computed absorption coefficients at 
300°K and 77°K are given in Figs. 4 and 5. The absorp- 
tion coefficients are expressed per unit total carrier 
concentration, giving them the dimensions of a cross 
section. Also included in these figures is the free carrier 
absorption of the holes in bands 1 and 2. This absorption 
is found in the same way as was done in Sec. 2 with the 
added assumption that both bands have the same 
relaxation time τ. The relaxation time was taken to be
of the order of 10-” sec for both bands. For samples as
highly doped as those reported on, τ will not change by
more than a factor of 2 as T varies from 300°K to 77°K.

When we compare Fig. 4 with the experimental curve,
we see that the correct number of peaks occur, but that
the peak of K_12 overlaps the other peaks. This overlap is
at about 2000 cm⁻¹, a region for which the transition is
for holes with wave vector k≈0.03a₀⁻¹, where a₀ is the
Bohr unit of length (0.529 A). This point is quite far out
from the center of the Brillouin zone and it is likely that
the energy there is no longer proportional to k² as we
have assumed. If the effective mass of band 2 were to
increase from the very small value of 0.04m, bands 1 and
2 would not diverge so rapidly and the K_12 peak would
fall off more rapidly on the short wavelength side, in
better agreement with experiment. At 77°K, Fig. 5, we
see that the peak of K_23 has become very narrow and
lower in magnitude. In the experimental plot, the band
with which we identify K_23 actually disappears at 77°K.
The other bands have the correct qualitative form. The
K_12 peak goes to a higher maximum than at 300°K and
falls off more sharply on the short wavelength side.
Absorption band K_13 narrows and decreases in magnitude.
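The wave vector quoted for the overlap region follows from the parabolic band model of Eq. (10): the photon energy equals the difference of the two band energies at the same k. A minimal sketch in atomic units, where the Rydberg energy stands for the quantity formed from the free electron mass and the Bohr length; the function name is a hypothetical choice.

```python
import math

RYDBERG_EV = 13.6057              # hbar^2 / (2*m*a0^2) in eV
EV_PER_WAVENUMBER = 1.239842e-4   # h*c in eV*cm

def k_for_transition(wavenumber_cm, m_heavy, m_light):
    """Wave vector (in units of 1/a0) at which a vertical transition between
    two parabolic bands absorbs light of the given wave number."""
    energy_ev = wavenumber_cm * EV_PER_WAVENUMBER
    inverse_mass_difference = 1.0 / m_light - 1.0 / m_heavy
    return math.sqrt(energy_ev / (RYDBERG_EV * inverse_mass_difference))

k_overlap = k_for_transition(2000.0, m_heavy=0.3, m_light=0.04)
```

The result, about 0.03 in units of the inverse Bohr length, agrees with the value stated in the text for transitions near 2000 wave numbers.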

From Fig. 2 it is seen that the absorption at 5°K is
very much like that at 77°K. At liquid helium temperatures
the carriers lie bound in impurity levels. An
acceptor level in germanium has a mean radius of about
13a₀. We have here used an effective mass of 0.3m. This
impurity state, when analyzed into plane waves, will
principally contain k's lying within a sphere of radius
0.2a₀⁻¹ in k space. This spread of k is of the same order
of magnitude as the spread of occupied states of band 1
at T=77°K, as seen from Fig. 3. If the Bloch functions
are at all like plane waves, we may expect the matrix
elements for absorption from band 1 to be of the same
magnitude as those for absorption from the impurity
levels. Hence we should expect the absorption at liquid
helium temperatures to be similar to that²⁰ at 77°K.

We see from Fig. 5 that at about 80 microns and
farther in the infrared the free carrier absorption
dominates. This is a possible explanation of the rise in
the absorption coefficient at this wavelength as observed
by Johnson and Spitzer.²¹

From the above theory, it appears that we should


20 This argument is due to Dr. C. Herring (private communication).

21 E. J. Johnson and W. G. Spitzer, Phys. Rev. 94, 1415 (1954);
Purdue Semiconductor Research Progress Report, October 1, 1953
(unpublished), pp. 50-54.
expect p-type silicon to have infrared absorption bands.
Application of Eq. (12) with effective masses 0.50m and
0.17m, as determined by Lax et al.,¹⁶ predicts a maximum
for band K_12 at 25μ for T=300°K. From their data we
find A_12²≈2.56, which leads to an absorption coefficient
K_12=5X10-*N at the maximum of the band. Reported
measurements of infrared absorption coefficient of p-type
silicon extend only as far as 10μ and are of the Drude-Zener
type. In the region between 10μ and 4μ the band
K_12 falls off rapidly but has the same order of magnitude
as the observed absorption. In the case of germanium it
was found that the band K_12 did not fall off rapidly
enough on the high frequency side. If this fault is a
result of the approximations made and if it also occurs in
application to silicon, we may expect that further
experimental investigation will locate this band. Spin-orbit
splittings in atomic silicon are roughly 1/8 those of
germanium.²² Hence we should expect bands K_13 and
K_23 to occur near 33μ in p-type silicon if they are
sufficiently high to be seen.
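The wavelength estimate for silicon is a unit conversion from the germanium gap. The sketch below assumes the silicon splitting is one eighth of the 2400 wave numbers adopted for germanium; that ratio is an assumption made here for illustration, and the helper names are hypothetical.

```python
def wavelength_microns(wavenumber_cm):
    """Convert a wave number in cm^-1 to a wavelength in microns."""
    return 1.0e4 / wavenumber_cm

def wavenumber_cm(wavelength_um):
    """Convert a wavelength in microns to a wave number in cm^-1."""
    return 1.0e4 / wavelength_um

GE_GAP_WAVENUMBER = 2400.0        # value adopted for germanium in the text
ASSUMED_SI_TO_GE_RATIO = 1.0 / 8  # assumed ratio of spin-orbit splittings

si_gap = GE_GAP_WAVENUMBER * ASSUMED_SI_TO_GE_RATIO
si_band_wavelength = wavelength_microns(si_gap)
k12_peak_wavenumber = wavenumber_cm(25.0)
```

Under that assumption the split-off transitions fall near 33 microns, and the predicted K_12 maximum at 25 microns corresponds to 400 wave numbers.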
A new treatment of the Auger effect using methods made available by modern high-speed digital com- 
puters is reported. These results have been used to evaluate the fluorescence yield in the lighter elements 
A, Kr, and Ag. The results, 0.13, 0.67, and 0.85 respectively, compare very favorably with the best available 


experimental values. 





A THEORETICAL study of the Auger effect 

among the lighter elements has just been com- 
pleted, and the numerical results thereof will soon be 
made available.! Auger and x-ray transition rates have 
been computed for three atoms: argon (Z=18), 
krypton (Z=36), and silver (Z=47). All of the com- 
putations were performed on the high-speed electronic 
digital computer of the Graduate College, University 
of Illinois. 

For each of the atoms investigated, a preliminary 
self-consistent field calculation was carried out using 
the Hartree method (without exchange).? The electronic 
wave functions thus derived were used for the evalua- 
tion of matrix elements in the subsequent calculations. 
Auger transition rates were calculated by means of 
the nonrelativistic theory first proposed by Wentzel,³
and subsequently used by other workers.⁴⁻⁶ Radiation
rates were calculated in the dipole approximation.⁷

The K-series fluorescence yield wx has been calcu- 
lated from the computed transition rates for each of the 
three atoms we have investigated. Table I is a com- 
parison of these theoretical predictions with earlier 
theoretical results’ and with the experiments.®® 
The quoted experimental values for the fluorescence 
yield are rather uncertain, but seem generally to be 
somewhat smaller than the theory predicts. 

In the case of silver, this discrepancy between the 
theory and experiment (if indeed such exists) may be 
due to our neglect of relativity. Using a relativistic 
theory of the Auger effect, Burhop and Massey” 
carried out trial calculations for silver and found that 
K-series Auger transition rates should be increased 
by about 20 percent over the values calculated from 

* Assisted in part by the U. S. Office of Naval Research.

† Now at Shell Development Company, Houston, Texas.

1 R. A. Rubenstein and J. N. Snyder (to be published).

2 D. R. Hartree, Proc. Cambridge Phil. Soc. 24, 89, 111 (1928);
Proc. Roy. Soc. (London) A141, 282 (1933).

3 G. Wentzel, Z. Physik 43, 524 (1927).

4 E. H. S. Burhop, Proc. Roy. Soc. (London) A148, 272 (1935).

5 L. Pincherle, Nuovo cimento 12, 81, 162 (1935).

6 E. G. Ramberg and F. K. Richtmeyer, Phys. Rev. 51, 913
(1937).

7 L. I. Schiff, Quantum Mechanics (McGraw-Hill Book Company,
Inc., New York, 1949), first edition, p. 255.

8 E. H. S. Burhop, The Auger Effect (The University Press,
Cambridge, 1952).

9 Broyles, Thomas, and Haynes, Phys. Rev. 89, 715 (1953).

10 E. H. S. Burhop and H. S. W. Massey, Proc. Roy. Soc.
(London) A153, 661 (1936).


the nonrelativistic theory.‘ Dipole radiation rates were 
found to be virtually unaffected by relativity. If the 
Burhop and Massey estimate of relativistic effects is 
applied to our results for silver, the K-series fluorescence 
yield is reduced from 0.85 to about 0.83, in excellent 
agreement with the most reliable of the experimental 
measurements. In this respect our calculation does not 
suffer from one characteristic of the earlier results, for 
which consideration of relativistic effects tended to 


TABLE I. K-series fluorescence yields in the
low to medium range of atomic number.

                                  Fluorescence yield
          Atomic    Previous    Experimental               Present
Atom      number    theoryᵃ     estimateᵇ                  theory

Argon       18       0.08       0.06 to 0.149 (0.11ᶜ)       0.13
Krypton     36       0.57       0.51 to 0.66 (0.6ᶜ)         0.67
Silver      47       0.79       0.72 to 0.88 (0.83ᵈ)        0.85

ᵃ See reference 8.

ᵇ The extremes among the various measurements are given. The numbers
in parentheses are considered here to be the best available experimental
estimate for the fluorescence yield.

ᶜ The value quoted here was obtained by rough interpolation on an
experimental curve. (See reference 9.)

ᵈ Measurement by coincidence techniques and magnetic spectrometer
[Huber, Humbel, Schneider, and de-Shalit, Helv. Phys. Acta 25, 3 (1952)].
This is considered by Broyles, Thomas, and Haynes (see reference 9) to be
the most reliable of the presently available methods for measuring the
fluorescence yield.


worsen rather than improve agreement with experiment.
(The earlier nonrelativistic estimate for ω_K was 0.79,⁸
and would be reduced to 0.76 when these effects are
considered.)
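
The arithmetic behind this relativistic adjustment can be checked with a short sketch. It assumes the usual definition of the K-series fluorescence yield as the radiative fraction of the total decay rate, ω_K = R/(R + A), with R the radiation rate and A the Auger rate; the function names are illustrative and are not taken from the paper. The Auger rate is multiplied by 1.20 and the dipole radiation rate is left unchanged, as the text describes.

```python
def fluorescence_yield(radiative_rate, auger_rate):
    """Radiative fraction of the total K-shell decay rate (assumed definition)."""
    return radiative_rate / (radiative_rate + auger_rate)


def rescale_auger(yield_value, auger_factor):
    """Yield after multiplying the Auger rate by auger_factor, radiation rate fixed."""
    # From the yield, recover the ratio A/R; absolute rates are not needed.
    auger_over_radiative = (1.0 - yield_value) / yield_value
    return 1.0 / (1.0 + auger_factor * auger_over_radiative)


present_silver = rescale_auger(0.85, 1.20)   # present nonrelativistic value
earlier_silver = rescale_auger(0.79, 1.20)   # earlier nonrelativistic value
print(round(present_silver, 2), round(earlier_silver, 2))  # 0.83 0.76
```

Both rounded results agree with the values quoted above for silver.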

For the lighter atoms argon and krypton, relativistic 
corrections are probably unimportant, and the dis- 
crepancy between theory and experiment must be 
otherwise explained. A tendency was found for the 
Auger transition rates to increase with atomic number, 
as the effects of screening on the innermost electrons 
of the atom become relatively less important. Our 
results could be explained on the basis of our having 
overestimated these screening effects. Thus, the 
calculated Auger transition rates would be somewhat 
smaller than the correct values, and the calculated 
fluorescence yield correspondingly too large. 

On the whole, these latest calculations of K-series 
fluorescence yields for argon, krypton, and silver are 
in qualitative agreement with experiment. 
Variation of Line Width with Rotational State and Temperature in the
Microwave Spectrum of OCS* 
The earlier measurements on the self-broadening of OCS rotational transitions by Johnson and Slager
have been extended. Measurements have been made at constant pressure on seven rotational transitions
in the ground vibrational state from J=3→4 to J=15→16 and at dry ice and room temperatures. The data
have been interpreted by a simple theory which assumes a constant contribution to line width from
dipole-dipole interactions and an additional J-dependent contribution from rotational resonance interactions.
Empirically determined temperature dependence is introduced for each contribution. This interpretation,
when fit to the observed line widths, Δν, results in the expression

$$\Delta\nu = \frac{1815}{T} + 0.0032\,T^{1/2}(J+1)^2 e^{-hBJ(J+1)/kT}\ \text{Mc/sec}$$

at 1-mm Hg pressure. The collision diameters for OCS are found to be approximately linear functions of the
rotational state and independent of temperature.





INTRODUCTION 


WHILE the general aspects are understood of the
process by which microwave spectral lines are
broadened by molecular collisions, there does not exist
complete understanding of the details of these mechanisms.
Theoretical explanations have been particularly
hampered by a lack of sufficient experimental information
on collision broadening. The data reported herein
represent an attempt to provide further experimental
information on the broadening process.

Collision broadening of microwave spectral lines
was first reported in detail by Bleaney and Penrose,¹
who measured the self-broadening of ammonia as a
function of transition. Their results were later verified
by Potter, Bushkovitch, and Rouse.² Theoretical explanations
of the results of Bleaney and Penrose were
immediately attempted by Margenau³ and by
Anderson.⁴

While considerable information was subsequently
published on collision broadening by foreign gases, the
next self-broadening data reported were on oxygen.
Anderson, Smith, and Gordy,⁵ Gokhale and Strandberg,⁶
and Artman and Gordon⁷ gave experimental evidence
for the variation of line width with rotational transition
in oxygen. At about the same time Leslie,⁸ Margenau,⁹


* This research was supported by the U. S. Air Force under a
contract monitored by the Office of Scientific Research, Air
Research and Development Command.

† Permanent address: Department of Physics, University of
Maryland, College Park, Maryland.

1 B. Bleaney and R. P. Penrose, Proc. Phys. Soc. (London)
B59, 418 (1947).

2 Potter, Bushkovitch, and Rouse, Phys. Rev. 83, 987 (1951).

3 H. Margenau, Phys. Rev. 76, 121 (1949).

4 P. W. Anderson, Phys. Rev. 76, 647 (1949); 80, 511 (1950).

5 Anderson, Smith, and Gordy, Phys. Rev. 82, 264 (1951);
87, 561 (1952).

6 B. V. Gokhale and M. W. P. Strandberg, Phys. Rev. 84, 844
(1951).

7 J. O. Artman and J. P. Gordon, Phys. Rev. 87, 227 (1952).

8 D. C. M. Leslie, Phil. Mag. 42, 37 (1951).

9 H. Margenau, Phys. Rev. 82, 156 (1951).


and Mizushima¹⁰ published new theories to explain
collision broadening. The most recent experimental
data on the effect of rotational transitions upon self-broadening
were reported by Johnson and Slager,¹¹
and Feeny, Lackner, Moser, and Smith,¹² who studied
this behavior in the OCS molecule.

Observation of the variation of line width with
temperature has been limited; however, experimental
data have been given by Howard and Smith for
ammonia,¹³ Beringer and Castle for oxygen,¹⁴ Johnson
and Slager¹¹ and Feeny et al.¹² for OCS, and Hill and
Gordy for oxygen.¹⁵

Experimental observations of collision self-broadening 
should take into account the following factors: (1) 
the molecule being observed should be simple, prefer- 
ably diatomic or at least linear; (2) rotational transi- 
tions of the electric dipole type should be observed; 
(3) measurements should be made over a wide range 
of transitions; (4) temperature dependence of line 
width should be determined for a constant number of 
molecules and also for constant pressure. The chief 
experimental difficulty in line-width measurements 
is the third point. Observation of a large number of 
transitions usually requires measurements over a wide 
band of wavelengths extending into the millimeter 
wavelength region. Ammonia and oxygen, however, 
are exceptions to this because of the peculiar origin of 
their microwave spectra. The ammonia spectrum is of 
the inversion type, while the oxygen spectrum arises 
from spin-spin interactions of the unneutralized 
electrons in this paramagnetic molecule. As a result 
the microwave spectrum of ammonia contains many 


10 M. Mizushima, Phys. Rev. 74, 705 (1948); 83, 94 (1951);
84, 362 (1951).

11 C. M. Johnson and D. M. Slager, Phys. Rev. 87, 677 (1952).

12 Feeny, Lackner, Moser, and Smith, J. Chem. Phys. 22, 79
(1954).

13 R. R. Howard and W. V. Smith, Phys. Rev. 77, 840 (1950).

14 R. Beringer and J. G. Castle, Phys. Rev. 81, 82 (1951).

15 R. M. Hill and W. Gordy, Phys. Rev. 93, 1019 (1954).
transitions over a relatively short wavelength region
from 1.3 to 1.8 cm while the oxygen spectrum is con- 
fined primarily to the region from 4.5 to 5.6 mm. 
With these two molecules it is then possible to measure 
line widths for a number of transitions in a narrow 
wavelength region, thus considerably reducing the 
experimental difficulties. For this reason the first 
measurements of line width as a function of transition 
were performed on ammonia and oxygen. 

Despite the desirable experimental features of 
measurements on ammonia and oxygen, it is difficult 
to arrive at a simple theoretical explanation of the 
observed results because of the unusual origin of these 
spectra. For this reason it is desirable to make measure- 
ments on molecules of simple structure whose absorp- 
tions arise from simple, rotational transitions. Data 
from such measurements should prove of value in 
theoretical explanation of broadening effects. Accordingly,
the sulfur carbonyl molecule was chosen for
this investigation. Earlier measurements on this molecule¹¹,¹²
indicated the dependence of line width on
rotational transition and temperature. The development
of experimental techniques for the generation of
wavelengths down to one millimeter¹⁶,¹⁷ has made
possible the extension of line-width measurements to
higher rotational transitions.


APPARATUS AND PROCEDURE 


The difficulties of measuring the half-intensity 
half-widths of spectral lines, according to the definition 
of the line width, are alleviated by making accurate 
measurements of the inflection points of the intensity 
curve and relating these points to the line width. 
Location of inflection points may be determined by 
observing the derivative of the intensity curve with 
respect to frequency. If one assumes small absorption
by the molecule, it is easily demonstrated that the
frequency separation of the inflection points, 2δν, is
related to the line width Δν, defined as the half-width
at half-intensity, by Δν = √3 δν. In the following section,
however, it will be shown that this result is not accurate
for the case where the molecule absorbs an appreciable
amount of the incident energy. It will accordingly
be modified to correct for this effect.
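
The small-absorption relation Δν = √3 δν can be verified numerically. The sketch below assumes a Lorentz line shape of half-width Δν, locates the point of steepest slope (the inflection point) by a golden-section search, and recovers the half-width from it. The function names and the 3 Mc/sec half-width are illustrative choices, not values from the measurements.

```python
import math


def lorentz_slope_magnitude(offset, half_width):
    """|dI/dnu| for I = half_width**2 / (offset**2 + half_width**2)."""
    return 2.0 * offset * half_width ** 2 / (offset ** 2 + half_width ** 2) ** 2


def inflection_offset(half_width, tol=1e-6):
    """Golden-section search for the offset at which the slope is steepest."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    low, high = 0.0, 3.0 * half_width
    while high - low > tol:
        left = high - ratio * (high - low)
        right = low + ratio * (high - low)
        if lorentz_slope_magnitude(left, half_width) > lorentz_slope_magnitude(right, half_width):
            high = right
        else:
            low = left
    return 0.5 * (low + high)


true_half_width = 3.0  # Mc/sec, illustrative value only
delta_nu_inflection = inflection_offset(true_half_width)
recovered_half_width = math.sqrt(3.0) * delta_nu_inflection
print(delta_nu_inflection, recovered_half_width)
```

The recovered half-width agrees with the input to within the search tolerance, which is the content of the relation for a weakly absorbing line.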

The technique of the measurement of line widths by 
the derivative method has been widely used.*:!!:!2.18 
For weak transitions the derivative shape is usually
recorded; however, it is more convenient to use cathode-ray
oscilloscope presentation for direct measurement.

A block diagram of the apparatus is given in Fig. 1. 
The microwave equipment consists of a K-band 
klystron which supplies energy to a crystal multiplier,¹⁷
here used to generate energy as high as the eighth 
harmonic of the klystron output. By means of appropri- 
ate transition sections this energy, at wavelengths 


16 W. C. King and W. Gordy, Phys. Rev. 90, 319 (1953). 


17 W. C. King and W. Gordy, Phys. Rev. 93, 407 (1954). 
18 R. R. Howard and W. V. Smith, Phys. Rev. 79, 128 (1950). 
FIG. 1. Block diagram of the apparatus.


down to 1.54 mm, is applied to a wave guide absorption
cell. The transmitted energy is detected by a special
silicon detector.¹⁷ In the present work three different
absorption cells were employed. Two were K-band cells
of lengths 100 cm and 10 cm, and one was a G-band cell
of length 17 cm. The shorter K-band cell was so constructed
that it could be completely immersed in a dry
ice-acetone bath, so that all absorbing molecules would
be at a temperature of 195°K.

The first derivative of the line shape is obtained by
superimposing a small (~0.1 v) sinusoidal voltage on
the sawtooth modulation applied to the klystron
reflector. This frequency modulated klystron output
results in an absorption signal incident on the detector
crystal which is proportional to the slope of the line
intensity curve at the unmodulated frequency of the
klystron. The smaller the frequency modulation, the
more closely the resulting signal will represent the
first derivative of the line shape.

Earlier techniques used 100 kc/sec sinusoidal
reflector modulation.¹¹,¹² The signal emergent from the
crystal detector, containing a 100 kc/sec component,
was amplified in a communication receiver and then
detected. The signal which resulted was proportional
to the absolute value of the first derivative of the
intensity. This technique was initially employed for
the present measurements, but was soon abandoned
for several reasons. First, for the very weak, high-transition
lines, this method was found extremely
insensitive. Second, the line width was found to vary
with frequency modulation amplitude. Third, the line
width varied with the sweep rate of the sawtooth
modulation. In an effort to eliminate these deficiencies,
particularly the loss of sensitivity, modifications were
made to this early technique. First, a frequency
modulation of 4 kc/sec was used. Second, the amplified
crystal output was not detected. The result was a
significant increase in sensitivity, so much so as to
exceed the signal strength of the line seen by conventional
video search techniques.¹⁹ The crystal
detector signal, containing a 4 kc/sec component, was

19 This modulation method is not recommended as a search
technique, since the adjustment of modulation voltage amplitude
and of amplifier frequency is critical and can only be made on a
line already known to exist. The method is similar to that described
by W. Gordy and M. Kessler, Phys. Rev. 72, 644 (1947).

FIG. 2. Photograph of the cathode-ray oscilloscope presentation
of the J=7→8 transition occurring at 97 301.32 Mc/sec, the
fourth harmonic of the klystron frequency.


amplified in an M.I.T. Radiation Laboratory type
TAA-16B twin-T amplifier tuned for 4 kc/sec. The
undetected output of this amplifier was fed directly to
the vertical deflection plates of a cathode-ray oscilloscope.
The resultant pattern is seen in Fig. 2. The
oscillating portion occurs at a 4 kc/sec rate and the
envelope of this oscillation is the first derivative of
the line shape. Feeny et al.,¹² investigating the effect of
modulation frequency on observed line width, have
shown that the use of frequencies higher than 10 kc/sec
produces a systematic increase in line width with frequency
increase. Their data indicate that the use of
4 kc/sec modulation frequency results in a good approximation
to the true measured line width, δν, as the
modulation frequency approaches zero. Thus it is
unnecessary to extrapolate line width measurements to
zero modulation frequency as they do.

While the low-frequency modulation technique 
greatly improved sensitivity and eliminated the effect 
of sweep rate on line width, the effect of frequency 
modulation amplitude on line width still remained. 
Accordingly measurements were made on the effect 
of modulation amplitude on line width. The results are 
shown in Fig. 3. The parabolic shape of the curve 
indicates that measurements on line width made using 
modulation amplitudes below 0.1v rms will evidence 
an inherent systematic broadening error of less than 
1 percent of the true line width. Consequently all 
measurements were made using the minimum possible 
modulating voltage and in no case did this exceed 
0.1v rms. 
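
The parabolic growth of the apparent width with modulation amplitude can be illustrated with a model calculation. The sketch below is not the analysis behind Fig. 3. It assumes a Lorentz line, slow sinusoidal frequency modulation, and detection of the component at the modulation frequency, and it expresses the modulation amplitude as a frequency deviation in units of the true half-width rather than as a reflector voltage. All names are illustrative.

```python
import math


def lorentz(offset):
    """Lorentz line of unit peak height; offset in units of the half-width."""
    return 1.0 / (1.0 + offset * offset)


def first_harmonic(offset, mod_amp, samples=200):
    """Component of the absorption at the modulation frequency."""
    total = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * (k + 0.5) / samples
        total += lorentz(offset + mod_amp * math.cos(theta)) * math.cos(theta)
    return 2.0 * total / samples


def derivative_peak(mod_amp, low=0.05, high=3.0, tol=1e-6):
    """Golden-section search for the extremum of the detected signal."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    while high - low > tol:
        left = high - ratio * (high - low)
        right = low + ratio * (high - low)
        # The signal is negative above line center, so compare magnitudes.
        if -first_harmonic(left, mod_amp) > -first_harmonic(right, mod_amp):
            high = right
        else:
            low = left
    return 0.5 * (low + high)


def fractional_broadening(mod_amp):
    """Excess of the apparent inflection offset over the true value 1/sqrt(3)."""
    return derivative_peak(mod_amp) * math.sqrt(3.0) - 1.0


for amplitude in (0.05, 0.10, 0.20):
    print(amplitude, fractional_broadening(amplitude))
```

In this model, doubling the modulation amplitude roughly quadruples the fractional broadening, and an amplitude of one tenth of the half-width broadens the line by somewhat less than 1 percent.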

The measurement of the frequency separation of 
the inflection points was achieved through the use of 
a movable marker superimposed as an intensity modu- 
lation of the cathode-ray trace of the derivative of the 
absorption line. This marker was derived from the 
beat note produced between a frequency standard 
output and the sawtooth frequency-swept klystron. 
This beat note, of about 25 Mc/sec, was amplified in a 
communication receiver. The receiver output, which 
was used to blank the oscilloscope trace, was applied 


to the oscilloscope through a clipping circuit which 
made possible width control of the blanked-out portion 
of the trace. This marker may be seen in Fig. 2. The 
position of the marker could be changed by tuning 
the receiver. In making the width measurement, the 
marker position was measured as a function of receiver 
dial reading as the marker was placed on top of each 
of the maxima of the derivative curve. The receiver 
dial was then calibrated by observing the location of 
calibration points separated by 1 Mc/sec. Interpola- 
tion was used between calibration points. 

The procedure used in making a measurement was
as follows: (1) By observation of the line, the amplitude
of the modulating voltage was set to the smallest
possible value less than 0.1 v rms. It was carefully
determined that this represented the minimum line
width. (2) Measurement of pressure (~10⁻¹ mm) was
made with a McLeod gauge and this used to calibrate
a vacuum thermocouple gauge. (3) Ten to twenty
width measurements were made. (4) The receiver
was calibrated. (5) Remeasurement was made of the
pressure. (6) Absorption cell temperature was observed.
(7) Dry ice was placed around the absorption cell and
the first six steps were repeated.


RESULTS 


As mentioned in the preceding section it is necessary
to correct the observed line widths, δν, for the effects
of large absorption by the molecules.²⁰ At the higher
frequencies and with long absorption cells energy absorption
by the molecules becomes very large, approaching
total absorption of the incident energy at the line
center. Because of this fact the shape of the absorption
line becomes distorted from the usual Lorentz shape.
The wings of the line absorb relatively more energy
and the line becomes broadened and flattened near its
center. The method of correcting for this effect is given
in the Appendix.
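
The size of this distortion can be estimated with a simple model; it is a sketch under stated assumptions and is not the correction formula of the Appendix. The absorbed fraction is taken as 1 − exp[−g/(1+x²)], where x is the frequency offset in units of the true half-width and g is the absorption coefficient at line center multiplied by the cell length (exponential attenuation with a Lorentz absorption coefficient). Setting the second derivative to zero gives 3x⁴ + (2 − 2g)x² − 1 = 0 for the inflection points. The names below are illustrative.

```python
import math


def absorbed_fraction(offset, peak_absorption):
    """Fraction of incident power absorbed, assuming exponential attenuation."""
    return 1.0 - math.exp(-peak_absorption / (1.0 + offset * offset))


def inflection_offset(peak_absorption):
    """Positive root of 3x^4 + (2 - 2g)x^2 - 1 = 0, in units of the true half-width."""
    g = peak_absorption
    x_squared = ((g - 1.0) + math.sqrt((g - 1.0) ** 2 + 3.0)) / 3.0
    return math.sqrt(x_squared)


def apparent_over_true_width(peak_absorption):
    """Ratio of sqrt(3) times the inflection offset to the true half-width."""
    return math.sqrt(3.0) * inflection_offset(peak_absorption)


for g in (0.0, 0.1, 0.5, 1.0, 2.0):
    print(g, apparent_over_true_width(g))
```

For vanishing absorption the ratio is unity, recovering Δν = √3 δν; it grows steadily as the absorption at line center increases, which is the broadening described above.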

The results of the measurements are given in Table I. 
The line widths are expressed in terms of the width the 
line would have if the pressure were 1 mm Hg. This form 
of expression is based upon the assumption that the 
line width is directly proportional to pressure. This 
is an established experimental fact for this range of 
pressures.*-?8 Reduction of data to this expression 
is necessary because of the fact that most of the 
measurements were made at different times, with 
different pressures, and with different absorption cells. 
The data are shown graphically in Fig. 4. 
20 The author is indebted to Dr. W. V. Smith for pointing out
the need for this correction.
TABLE I. Transition frequencies, line widths, and collision diameters for the OCS molecule. Frequencies are those of King and Gordy
(see reference 17). Figures in the Cell column designate wave-guide size (K- or G-band) and length of absorption cell in cm. Measurements
designated K-214 are those of Johnson and Slager (see reference 11), and F, those of Feeny et al. (see reference 12). Line-width corrections
are calculated from Eq. (10). Corrected line widths, divided by pressure, express results uniformly at 1-mm Hg pressure. Line
width values in parentheses are mean values for that temperature. Collision diameters are calculated from mean line-width. (Note: 
ry a immersion of the G-17 cell in dry ice caused slight narrowing of lines. Collision diameters at 195°K are based only on 
K-10 data. 
As correction for line broadening resulting from
molecular collisions with the walls of the absorption
cell, the formula of Danos and Geschwind²¹ was considered.
When these corrections were applied to the
data it appeared that, for the smaller G-band wave
guide, the corrections were excessive and resulted in
discrepancies between data obtained on the same transitions
but in different absorption cells. Since, without
considering wall collisions, the measurements in the two
different size wave guides agreed fairly well, these corrections
were not applied.

Since the mechanism of line broadening is a collision
process, it is informative to determine collision diameters
according to the kinetic theory. This has been
done by using the expression!®

$$\Delta\nu = \tfrac{1}{2}\sqrt{2}\, n \bar{v} b^{2},$$

where n is the number of molecules per cm³, v̄ is the
mean relative velocity of the molecules, and b is the
collision diameter of the molecule (which may here be
considered as the molecular diameter²²). The mean
relative velocity of the molecules, v̄, is given by²³

$$\bar{v} = 2(2R_M T/\pi M)^{1/2},$$

where R_M = 8.31×10⁷ erg/°C-mole, T is the absolute
temperature, and M is the molecular weight of OCS.

21 M. Danos and S. Geschwind, Phys. Rev. 91, 1159 (1953).

22 E. H. Kennard, Kinetic Theory of Gases (McGraw-Hill Book
Company, Inc., New York, 1937), p. 103.

23 See reference 22, pp. 45 and 112.
FIG. 4. Line width as a function of transition and temperature.
Solid points represent experimental observations at dry ice temperature
(195°K). Circles are data at room temperature (300°K).
The curves are plotted from Eq. (5). Crosses are calculations made
by Smith et al. from their theory (see reference 27).


The resultant values for the collision diameter, b, are 
given in Table I. 
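
A rough numerical sketch of this conversion follows. It rests on two assumptions that should be checked against the printed paper: that the kinetic-theory expression reads Δν = ½√2 n v̄ b² (the printed formula is damaged in this copy), and that a molecular weight of 60.07 applies to OCS. The input width of 6.05 Mc/sec per mm is simply the constant term 1815/T of the abstract's formula at 300°K, used here as an illustrative low-J value rather than a measured entry of Table I. Function and constant names are hypothetical.

```python
import math

GAS_CONSTANT = 8.314e7        # erg / (deg mole)
BOLTZMANN = 1.380649e-16      # erg / deg
MM_HG_IN_DYN_CM2 = 1333.22    # pressure of 1 mm Hg in dyn/cm^2
OCS_MOLECULAR_WEIGHT = 60.07  # g/mole, assumed value


def number_density(pressure_mm_hg, temperature_k):
    """Molecules per cm^3 from the ideal gas law."""
    return pressure_mm_hg * MM_HG_IN_DYN_CM2 / (BOLTZMANN * temperature_k)


def mean_velocity(temperature_k, molecular_weight):
    """Velocity as written in the text: 2 (2 R T / pi M)^(1/2), in cm/sec."""
    return 2.0 * math.sqrt(2.0 * GAS_CONSTANT * temperature_k / (math.pi * molecular_weight))


def collision_diameter_cm(width_mc_per_mm, temperature_k):
    """Invert the assumed relation width = (sqrt(2)/2) n v b^2 at 1 mm Hg."""
    width_cps = width_mc_per_mm * 1.0e6
    density = number_density(1.0, temperature_k)
    velocity = mean_velocity(temperature_k, OCS_MOLECULAR_WEIGHT)
    return math.sqrt(math.sqrt(2.0) * width_cps / (density * velocity))


diameter = collision_diameter_cm(6.05, 300.0)
print(diameter * 1.0e8, "angstroms")
```

Under these assumptions the diameter comes out at several angstroms, well above gas-kinetic molecular sizes, as is typical of dipole-broadened microwave lines.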

The effect of temperature on the line width is also 
shown in these data. It should be emphasized, however, 
that these figures represent the variation of line width 
with temperature for constant pressure, since the absorption
cell, at dry ice temperature, was connected to
the filling system which was at room temperature. The
filling system acted as a reservoir to maintain constant
pressure.

For the estimation of errors in these measurements, 
in addition to random errors, it is necessary to consider 
effects resulting in systematic broadening. While the 
influence of modulation frequency and amplitude on 
observed line widths has been minimized, there may 
exist a broadening of not more than 2 percent. Other 
broadening up to 2 percent may result from the method
of presentation of the line¹⁶,¹⁷ and the proximity of
harmonics of the measured line. However, no effect due
to klystron sweep frequency or amplitude was
observed. The other errors involved are all of random
nature; these errors, of amounts indicated, arise in
frequency measurement by the receiver (1 percent),
calibration of the receiver (primarily from interpolation
of receiver dial settings) (1 percent), and the measurement
of pressure (2 percent). Other random errors,
estimated at 3 percent, may arise from the curvature of
the klystron mode or from the small reflections present
in the absorption cell. The total systematic error is then
about 4 percent and the total random error, 7 percent.
The errors are reported in percent because of the fact
that the data were obtained by multiplying the measurements
by the harmonic multiple of the klystron frequency.
Thus the random error in the J=3→4 measurement
is 0.5 (Mc/sec) per mm and in the J=13→14
measurement is 1.3 (Mc/sec) per mm.
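
The quoted totals correspond to straight addition of the individual percentages, as the short tally below shows. The dictionary labels paraphrase the text; the quadrature sum is included only for comparison and is not a figure given in the paper, and the 7.0 Mc/sec per mm width used at the end is a hypothetical input.

```python
import math

systematic_percent = {
    "modulation frequency and amplitude": 2.0,
    "line presentation and nearby harmonics": 2.0,
}
random_percent = {
    "receiver frequency measurement": 1.0,
    "receiver calibration": 1.0,
    "pressure measurement": 2.0,
    "klystron mode curvature and cell reflections": 3.0,
}


def linear_total(contributions):
    """Sum of the percentage contributions, as used for the quoted totals."""
    return sum(contributions.values())


def quadrature_total(contributions):
    """Root-sum-square combination, shown for comparison only."""
    return math.sqrt(sum(value ** 2 for value in contributions.values()))


def absolute_error(width_mc_per_mm, percent):
    """Convert a percentage error into (Mc/sec) per mm for a given width."""
    return width_mc_per_mm * percent / 100.0


print(linear_total(systematic_percent), linear_total(random_percent))
print(quadrature_total(random_percent))
print(absolute_error(7.0, linear_total(random_percent)))  # hypothetical width
```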
INTERPRETATION


The interpretation of these experimental results has
not been possible by any existing theory. The most
rigorous theory available for explanation of self-broadening,
that of Anderson,⁴ shows that the line
width, for linear molecules in low rotational quantum
states, varies linearly with J. At higher rotational
states this variation is less than linear because of the
effect of the Boltzmann factor. Both the experimental
results presented here and those of Johnson and Slager¹¹
show a quadratic variation of line width with J.

The simplified theory given by Smith and Howard²⁴
takes into account dipole-dipole and dipole-quadrupole
interactions between molecules. According to their
interpretation an interaction may only exist in the
dipole-dipole collision if a symmetric- or asymmetric-top
colliding molecule is involved, since only these molecules
possess dipole moments when averaged over all
orientations. While the self-broadening of linear molecules
could not be accounted for by this interpretation,
there exists the possibility of interpretation by a
dipole-quadrupole interaction. In view of more recent
experimental evidence presented by Benesch and
Elder,²⁵ this latter mechanism has not been invoked.
Similarly the theory of Mizushima,¹⁰ based upon
quadrupole-quadrupole interaction, has not been used.
This interaction, which is much smaller than dipole
interactions, predicts a narrowing of the absorption
line with higher rotational state.

Benesch and Elder report high-resolution measure- 
ments in the infrared region on foreign gas broadening 
of vibration-rotation lines of HCl and CH₄. While the
spectra which they observe could not satisfy the Van 
Vleck-Weisskopf condition that the duration of the 
collision be short compared with the period of the 
incident radiation,²⁶ as well as not being similar in
origin to microwave spectra, their Fig. 1 gives strong
evidence for the fact that in the microwave and infrared
regions collision broadening takes place by the same
mechanism. They also present evidence in their Fig. 2
for the interpretation that, contrary to the results
found by Anderson⁴ and Smith,²⁴ dipole-dipole interactions
are important in collision broadening, irrespective
of the configuration of the broadening molecule.
They conclude that: (1) at least two interactions are 
necessary to account for the broadening of a dipole 
absorber by a series of foreign broadening gases; (2) 
for foreign broadening gases which do not have dipole 
or quadrupole moments a polarizability interaction 
should be considered; (3) the broadening mechanism 
depends primarily upon a property of the broadening 
molecule and only to a lesser degree on one possessed 
by the absorbing molecule; and (4) for broadening 
molecules having quadrupole or dipole moments, the 


24 W. V. Smith and R. R. Howard, Phys. Rev. 79, 132 (1950).

25 W. Benesch and T. Elder, Phys. Rev. 91, 308 (1953).

26 J. H. Van Vleck and V. F. Weisskopf, Revs. Modern Phys. 17,
227 (1945).
existence of strong broadening effects requires a more
powerful interaction than polarizability forces between 
colliding molecules. 

Recent theoretical work by Smith, Lackner, and
Volkov²⁷ has shown that the magnitude of the self-broadening
of OCS can be explained by a first-order
dipole-dipole interaction using Anderson's⁴ general
theory. They indicate, however, that the observed temperature
dependence of self-broadening must be accounted
for with higher order interactions in addition
to a dipole-dipole interaction of first order. These results
are in agreement with the conclusions of Benesch
and Elder.

Several examples of self-broadening now exist in
the literature. All of these have shown self-broadening
effects greater than their foreign broadening effects²⁸;
these molecules are: NH₃, OCS, HCN, ClCN, O₂, and
HCl.* (The latter molecule has only been observed in
the infrared; HCN has been observed in both the
infrared and microwave regions.) To account for this
anomalous behavior in NH₃ and O₂ resonance effects
have been postulated.*:®

According to all collision broadening theories, 
collisions may be either weak or strong. The weak or 
adiabatic collision results from perturbations of the 
initial and final energy levels of the absorbing molecule 
produced by a Stark effect due to the proximity of 
the broadening molecule. A weak collision results in a 
phase shift of the radiation, thus broadening the line. 
The strong or nonadiabatic collision results from a 
complete interruption of the molecular absorption 
by having the absorbing molecule undergo an induced 
transition caused by collision with the broadening 
molecule. This type of broadening is explained by the
Uncertainty Principle, since, in ΔE Δt ≳ h, the state
lifetimes Δt are distributed statistically, resulting in a
distribution of ΔE = hΔν for a line width.

For microwave broadening, only strong collisions
need be considered.²⁴ This results from the explanations
for saturation effects and also from the fact that no
shift in frequency of the maximum of a microwave
line occurs when pressure is increased. Frequency
shifts, which occur in optical measurements, are
characteristic of weak collision. Further evidence for
the predominance of strong, transition-producing
collisions is that the energy loss involved in an absorption
is much less than the thermal energy of the molecule,
that is, hν ≪ kT for microwave spectra. Thus
thermal collisions may involve relatively large energy
transfers.
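
The inequality hν ≪ kT is easily checked for the frequencies involved here. The sketch below uses the 97 301.32 Mc/sec transition quoted in the Fig. 2 caption and the two working temperatures; the constants are modern SI values and the function name is illustrative.

```python
PLANCK = 6.62607015e-34     # joule sec
BOLTZMANN = 1.380649e-23    # joule / deg


def photon_to_thermal_ratio(frequency_mc, temperature_k):
    """Ratio h*nu / (k*T) for a frequency given in Mc/sec."""
    return PLANCK * frequency_mc * 1.0e6 / (BOLTZMANN * temperature_k)


for temperature in (195.0, 300.0):
    print(temperature, photon_to_thermal_ratio(97301.32, temperature))
```

At both temperatures the ratio is of the order of a few hundredths, so a single thermal collision can supply many times the energy of the absorbed quantum.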

The explanation of the observed behavior of the line
width as a function of J will be based upon the assumption
of two contributing interactions, dipole-dipole and
resonance. It will be assumed that the contributions
to line width by dipole and other interactions are constant.
Added to these contributions will be a resonance
interaction term.

27 Smith, Lackner, and Volkov, J. Chem. Phys. (to be published).

28 Gordy, Smith, and Trambarulo, Microwave Spectroscopy
(John Wiley and Sons, Inc., New York, 1953), pp. 192-193.

For the resonance term assume a collision-induced
resonance transition. The lifetime of a molecule in the
state J should be inversely proportional to the probability
B_{J→J+1} of a collision-induced transition and
to the difference in the populations N_J − N_{J'} of the
initial and final states. The state lifetime is inversely
proportional to line width,²⁹ so that the line width
contribution due to resonance interaction may be
expressed by

$$\Delta\nu_r = \text{const}\; B_{J\to J+1}\,(N_J - N_{J'}). \tag{1}$$


The transition probability is given by” 


$$B_{J\to J'} = \frac{8\pi^{3}}{3h^{2}}\,|(J,M|\mu|J',M)|^{2}.$$


Considering subsequently only the case where J’=J +1, 
the square of the matrix element is, for all orientations,” 


$$|(J,M|\mu|J+1,M)|^{2} = \mu^{2}(J+1)/(2J+1).$$

Thus the transition probability will be expressed by

$$B_{J\to J+1} = \frac{8\pi^{3}\mu^{2}}{3h^{2}}\left(\frac{J+1}{2J+1}\right). \tag{2}$$


The difference in the populations of the lower and 
upper states is given by the Boltzmann distribution in 
the approximate form” 


$$N_J - N_{J'} \cong N_J\, h\nu_{J\to J'}/kT.$$

For a linear molecule the transition frequency is

$$\nu_{J\to J+1} = 2B(J+1).$$

The number N_J of molecules in the state J is given by
the rotational Boltzmann distribution whose weight
factor and state sum include the sum over all values of
M (which was not included in the square of the matrix
element),³¹

$$N_J = N\,(hB/kT)(2J+1)\,e^{-hBJ(J+1)/kT}.$$

Thus

$$N_J - N_{J'} \cong (2Nh^{2}B^{2}/k^{2}T^{2})\,(J+1)(2J+1)\,e^{-hBJ(J+1)/kT}. \tag{3}$$

Substitution of Eqs. (2) and (3) into Eq. (1) gives the 
resonance term 


$$\Delta\nu_r=\mathrm{const}\;(J+1)^{2}\,e^{-hBJ(J+1)/kT}. \qquad (4)$$


A term of similar form is obtained for the transition
$J\to J-1$.
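
The step from Eqs. (2) and (3) to Eq. (4) can be written out as follows. This is only a sketch of the intermediate algebra, with every factor that does not depend on J absorbed into the constant:

```latex
\Delta\nu_r
  = \mathrm{const}\; B_{J\to J+1}\,(N_J-N_{J+1})
  \propto \frac{J+1}{2J+1}\,\bigl[(J+1)(2J+1)\bigr]\,e^{-hBJ(J+1)/kT}
  = (J+1)^{2}\,e^{-hBJ(J+1)/kT}.
```

The factor $(2J+1)$ from the statistical weight cancels the $(2J+1)$ in the denominator of the squared matrix element, which is why only $(J+1)^{2}$ survives.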


29 See reference 28, p. 186.

30 See reference 28, p. 201.

31 G. Herzberg, Spectra of Diatomic Molecules (D. Van Nostrand
Company, Inc., New York, 1950), p. 125. 
The line width may then be expressed by
$$\Delta\nu=\Delta\nu_d+\Delta\nu_r,$$

where $\Delta\nu_d$ is a constant representing the line width
contribution resulting from the dipole-dipole inter- 
action. 

This resulting equation was found to fit well the dry- 
ice and room temperature data considered separately, 
but the temperature dependence indicated by Eq. (3) 
was not observed. For low J-values, where the resonance 
contribution is insignificant, the line widths at constant 
pressures are observed to follow an inverse temperature 
dependence.” For higher J-values the resonance term 
was found to vary as the square root of the temperature. By empirical fitting of an equation that takes
these temperature effects into account, it was found that the data
for both temperatures were expressed by the general
formula:


$$\Delta\nu=1815/T+0.00321\,T^{1/2}(J+1)^{2}\,e^{-hBJ(J+1)/kT}\quad\text{(Mc/sec) per mm.} \qquad (5)$$


For comparison, Eq. (5) is plotted on Fig. 4 with the 
experimental data. No justification can be given for 
either temperature dependence. 

The general form of Eq. (5), when extrapolated to 
high J-values, is seen to be similar to the experimental 
results obtained by Lindholm from the vibrational 
spectra of HCl and HCN.” The effect of the Boltzmann 
distribution is already indicated in the present data. 
The maximum line widths from Eq. (5) will occur at 
J=24 at 195°K and J=31 for 300°K. 
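
The location of the maximum can be sketched by treating J as a continuous variable in Eq. (5). Only the resonance term depends on J, so the first term plays no part. The block below is a worked step under that assumption, not a statement of how the quoted J-values were obtained; B is the rotational constant of the molecule, whose numerical value is not restated in this section.

```latex
\frac{d}{dJ}\Bigl[(J+1)^{2}e^{-hBJ(J+1)/kT}\Bigr]=0
\;\Longrightarrow\;
2(J+1)-\frac{hB}{kT}\,(2J+1)(J+1)^{2}=0
\;\Longrightarrow\;
(J+1)(2J+1)=\frac{2kT}{hB}.
```

Because the right-hand side is proportional to T, the J of maximum width grows roughly as the square root of the temperature, which is the sense of the shift between the two temperatures quoted above.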

The four points calculated by Smith et al. from their rigorous theory²⁷ are indicated on Fig. 4. There is
excellent agreement between these values and the experimental data. It is hoped that their calculations will
be extended to the higher J-values. 

It is noteworthy that the collision diameters b of OCS, 
measured at constant pressure, are linearly dependent 
upon J and almost independent of temperature. The 
observed collision diameters are at least twice the 
kinetic theory diameters.”® 

It is a pleasure to acknowledge the aid and encourage- 
ment of Professor Walter Gordy in this work and the 
contribution of Charles A. Burrus, Jr., in obtaining the 
short-wavelength radiation necessary for these measure- 
ments. The author should like to express his gratitude 
to Professor William V. Smith for furnishing pre- 
publication information on his theoretical work and for 
suggesting important corrections. 


APPENDIX 


The correction of observed line widths for the dis- 
tortion in line shape produced by extreme absorption 
of the microwave energy will now be considered. The 
measurements of line width are made directly on the 
response of the crystal detector which indicates the loss 


® See Figure 2 of reference 4a. 


ANDERSON 


in microwave power on transmission through the ab- 
sorbing medium contained in the wave-guide cell. The 
method of correction has been given by Howard.* 
The absorbed power is expressed by 


$$P=P_0\,(1-e^{-\alpha l}), \qquad (6)$$


where P is the absorbed power; $P_0$, the incident power;
l, the absorption cell length; and $\alpha$, the absorption coefficient, which is given by

$$\alpha=\alpha_0 S=\alpha_0\,\frac{(\Delta\nu)^{2}}{(\nu-\nu_0)^{2}+(\Delta\nu)^{2}}. \qquad (7)$$


Here $\alpha_0$ is the peak absorption for the molecule; S, the
Lorentz line shape factor; $\nu$, the observation frequency;
$\nu_0$, the frequency of line center; and $\Delta\nu$, the line width.

For most measurements the exponential argument is 
small, so that the absorbed power is given by 


$$P=P_0\alpha l=P_0\alpha_0 S l,$$


and the signal from the crystal detector is directly 
proportional to the Lorentz shape factor. Thus the line 
shape is undistorted. For a long absorption cell or for 
large $\alpha_0$, this approximation is no longer valid and the
exponential form must be used. 

For a linear molecule the peak absorption coefficient 
is given by* 





$$\alpha_0=\frac{4\pi^{2}hN\mu^{2}\nu^{3}}{3ck^{2}T^{2}\,\Delta\nu}\,e^{-hBJ(J+1)/kT}. \qquad (8)$$


Thus at high frequencies the peak absorption coefficient 
may become large. 

The apparatus measures the frequency separation,
$2\delta\nu$, of the inflection points of the absorption line, that
is, the points where

$$\partial^{2}P/\partial\nu^{2}=0. \qquad (9)$$


However, it is desired to determine the line half-width
at half-power, $\Delta\nu$, and thus it is necessary to determine
the relation between the observed $\delta\nu$ and the desired $\Delta\nu$.
This may be derived by applying Eq. (9) to Eqs. (6),
(7), and (8) and using series expansions on this result.
The relation obtained is then

$$\Delta\nu=\sqrt{3}\,\delta\nu\,[1-\alpha_0 l/4+(\alpha_0 l)^{3}/128+\cdots].$$


It may then be shown by using only the first two terms, 
that the line width will be given by 


$$\Delta\nu=\sqrt{3}\,\delta\nu-3.70\times10^{-9}\,(\nu^{3}\mu^{2}pl/T^{3})\,e^{-hBJ(J+1)/kT}, \qquad (10)$$

where $\nu$ is the transition frequency in Mc/sec; $\mu$, the
dipole moment of the molecule in Debye units; p, the
pressure in mm Hg; and l, the absorption cell length in
cm. This formula was used to obtain the corrections
given in Table I.
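
The series relation between the observed inflection-point spacing and the true half-width can be checked numerically. The sketch below assumes a Lorentz shape factor with unit peak value, so that the absorbed power is proportional to 1 - exp(-a/(1+x²)) with a = α₀l and x = (ν-ν₀)/Δν. The function names are illustrative and do not come from the original paper.

```python
import math

def inflection_offset(a):
    """Return x = delta_nu / Delta_nu at the inflection point of
    P/P0 = 1 - exp(-a / (1 + x**2)), where a = alpha_0 * l (assumed model)."""
    # Setting d2P/dx2 = 0 for this shape gives, with u = x**2,
    # 3*u**2 + (2 - 2*a)*u - 1 = 0; take the positive root.
    u = (-(1.0 - a) + math.sqrt((1.0 - a) ** 2 + 3.0)) / 3.0
    return math.sqrt(u)

def correction_exact(a):
    """Exact ratio Delta_nu / (sqrt(3) * delta_nu) for the assumed model."""
    return 1.0 / (math.sqrt(3.0) * inflection_offset(a))

def correction_series(a):
    """First terms of the series: 1 - a/4 + a**3/128."""
    return 1.0 - a / 4.0 + a ** 3 / 128.0

# Compare the closed form with the truncated series for a few values of alpha_0*l.
for a in (0.0, 0.05, 0.1, 0.2):
    print(a, correction_exact(a), correction_series(a))
```

For small α₀l the two columns agree closely, and at a = 0 the ratio is exactly 1, which is the undistorted case in which the inflection points lie at Δν/√3 from line center.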

33 R. R. Howard, Ph.D. thesis, Duke University, 1950 (unpublished).
34 See reference 28, pp. 186, 204-6.
The J=0→1 rotational transitions of TCl and TBr have been
measured at 1.36-mm and 1.74-mm wavelengths, respectively.
These measurements, with α_e and D from infrared data, yield the
following constants:

For tritium chloride,
              TCl³⁵                      TCl³⁷
ν₀(0→1)       222 143.78±0.4 Mc/sec      221 195.40±0.4 Mc/sec
B₀            111 075.76 Mc/sec          110 601.53 Mc/sec
B_e           112 032.0 Mc/sec           111 550.6 Mc/sec
r₀            1.28003 A                  1.28002 A
r_e           1.27456 A                  1.27456 A
eQq(Cl)       67.0±0.6 Mc/sec            53.0±0.6 Mc/sec

For tritium bromide,
              TBr⁷⁹                      TBr⁸¹
ν₀(0→1)       172 499.05±0.3 Mc/sec      172 343.23±0.4 Mc/sec
B₀            86 252.24 Mc/sec           86 174.33 Mc/sec
B_e           86 947.2 Mc/sec            86 868.8 Mc/sec
r₀            1.42012 A                  1.42011 A
r_e           1.41443 A                  1.41442 A
eQq(Br)       530±2 Mc/sec               443±2 Mc/sec

From the TBr measurements and similar measurements on DBr
made previously in this laboratory, the mass ratio m_T/m_D
= 1.49747 was obtained.


INTRODUCTION 


THE present work represents a part of a program to
measure precisely the spectra of light simple
molecules which have their first rotational lines 
occurring in the shorter millimeter or upper sub- 
millimeter range. The hydrogen halides DBr and DI 
have already been measured.! In the present paper 
results on TCl and TBr are reported. The latter isotopic 
forms, to our knowledge, have not previously been 
investigated with optical spectroscopy although the 
hydrogen and deuterium species have been thoroughly 
studied with infrared spectroscopy. 


EXPERIMENTAL RESULTS 


The methods of generation, detection, and measurement of millimeter wave frequencies are essentially
the same as those already described in papers from Duke 
University.’ 

A dc bias on the multiplier crystal was used to ad- 
vantage in some of the present observations. Although 
no improvement over the optimum unbiased perform- 
ance was obtained, the bias proved to be helpful in 
arriving more rapidly at the optimum performance. It 
is of greatest advantage when the klystron output is 
below that needed for best multiplication. 

For the TBr measurements, the seventh harmonic 
from a 2K33 klystron was employed while for TCl the
ninth harmonic of the same tube was used. Frequency 
measurements were made at the fundamental output 
with a secondary frequency standard monitored by 
Station WWV. 


t This research was supported by the U. S. Air Force through 
the Office of Scientific Research of the Air Research and Develop- 
ment Command and by the U. S. Atomic Energy Commission. 

1C. A. Burrus and W. Gordy, Phys. Rev. 92, 1437 (1954). 

2 W. Gordy and C. A. Burrus, Phys. Rev. 93, 419 (1954). 

3 W. C. King and W. Gordy, Phys. Rev. 90, 319 (1953); 93, 
407 (1954). 


The advantages of the shorter millimeter wave 
spectroscopy for the study of radioactive gases are 
demonstrated by the present measurements made with 
a G-band wave-guide cell 15 cm in length. The total 
volume of the cell was only 0.2 cc. Figure 1 shows the 
J=0→1 lines of TCl³⁷ observed in this cell. A crystal
video detector with a wide band (10-kc) amplifier and 
a 60 cps sweep was employed. These lines could, of 
course, be recorded with a narrow-band amplifier at a 
signal-to-noise ratio better than 10 times that shown. 
The concentration of TCl³⁷ in the sample is normal, 25
percent, and the pressure is of the order of 10-? mm 
of Hg. Collisions with the cell walls probably broaden 
the lines here. 

The samples of tritium chloride and tritium bromide 
were prepared in the Oak Ridge National Laboratory 
by the action of ultraviolet light on mixtures of tritium 
gas and the halide.


TRITIUM CHLORIDE 


The first rotational line of TCl falls at 1.3 mm, and is
split into two sets of three components by the Cl
isotopic shift and by the Cl nuclear quadrupole interaction. Table I gives the measured frequencies. From
an analysis of the spectrum, unperturbed rotational
frequencies, ν₀, and nuclear quadrupole couplings,


TABLE I. J=0→1 transition of tritium chloride.

                          Frequency in Mc/sec
           F→F'        Observed            Calculatedᵃ

TCl³⁵      3/2→3/2     222 130.32±0.4      222 130.38
           3/2→5/2     222 147.23±0.4      222 147.13
           3/2→1/2     222 160.50±0.4      222 160.53

TCl³⁷      3/2→3/2     221 184.82±0.4      221 184.80
           3/2→5/2     221 198.00±0.4      221 198.05
           3/2→1/2     221 208.69±0.4      221 208.65

ᵃ Calculated with the ν₀ and coupling constants listed in Table II.
FIG. 1. Oscilloscope tracing of the J=0→1 rotational transition (1.36-mm wavelength) of TCl³⁷ with Cl³⁷ in its natural abundance
of 25 percent. The absorption cell employed was of 15 cm length and 0.2 cc total volume.


eQq, listed in Table II were obtained. These were then 
used to calculate a comparative spectrum for showing 
the degree of consistency of the results. Methods of 
analysis of such a spectrum are described in detail 
elsewhere⁴ and will not be repeated here.

For molecules rotating at the high frequencies of the 
one-to-two millimeter wave range, the centrifugal 
stretching effects are not entirely negligible even in the 
J=1 state. Because only one rotational line was observed, we could not measure the stretching constant
$D_J$. Fortunately, $D_J$ can be calculated from the infrared
values for HCl or DCl to the accuracy needed to correct
for the small stretching in the J=1 state. Similarly,
the zero point vibration effects which could not be
measured in the present work are calculable from the
infrared results. The $B_0$ value for a J=0→1 transition
is connected to the $\nu_0$ by

$$B_0=\nu_0/2+2D_J.$$

Pickworth and Thompson⁵ give for DCl³⁵, $D_J=1.371\times10^{-4}$ cm⁻¹
and for DCl³⁷, $D_J=1.363\times10^{-4}$ cm⁻¹,
which in Mc/sec are 4.110 and 4.086, respectively.
We convert these to the TCl values of 1.934 and 1.917
Mc/sec by multiplication with the square of the


4 Gordy, Smith, and Trambarulo, Microwave Spectroscopy
(John Wiley and Sons, Inc., New York, 1953). 

5 J. Pickworth and H. W. Thompson, Proc. Roy. Soc. (London) 
A218, 37 (1953). 


reduced mass ratio $[\mu(\mathrm{DCl})/\mu(\mathrm{TCl})]^{2}$. The equilibrium
value, $B_e$, is given by

$$B_e=B_0+\alpha_e/2.$$

To obtain $\alpha_e$, we converted the values 0.1123 cm⁻¹ and
0.1118 cm⁻¹, respectively, of DCl³⁵ and DCl³⁷ (from
Pickworth and Thompson⁵) to the corresponding TCl
values by multiplication with the factor
$[\mu(\mathrm{DCl})/\mu(\mathrm{TCl})]^{3/2}$. The $B_e$ values which we give are limited by
the $\alpha_e$ values thus obtained to six or perhaps five
significant figures. The internuclear distances calculated
from the $B_0$ and $B_e$ values are given in Table II. The
$r_e$ value, 1.274 A, in DCl measured by Pickworth and


TABLE II. Characteristic constants of tritium chloride.ᵃ

              TCl³⁵                      TCl³⁷
eQq(Cl)       67.0±0.6 Mc/sec            53.0±0.6 Mc/sec
ν₀(0→1)       222 143.78±0.4 Mc/sec      221 195.40±0.4 Mc/sec
B₀            111 075.76 Mc/sec          110 601.53 Mc/sec
D             1.934 Mc/secᵇ              1.917 Mc/secᵇ
α_e           1912 Mc/secᵇ               1898 Mc/secᵇ
B_e           112 032.0 Mc/sec           111 550.6 Mc/sec
r₀            1.28003 A                  1.28002 A
r_e           1.27456 A                  1.27456 A

ᵃ Masses used for calculations: H = 1.008145 amu; D = 2.014741 amu;
T = 3.016997 amu; Cl³⁵ = 34.980064 amu; Cl³⁷ = 36.977675 amu.

ᵇ Values of D and α are calculated from the corresponding values for
DCl obtained by infrared spectroscopy by H. W. Thompson and J. Pickworth, Proc. Roy. Soc. (London) A218, 37 (1953).
MILLIMETER WAVE SPECTRA OF TCl AND TBr


Thompson is in excellent agreement with the microwave 
value. 
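
The chain of conversions described above (scaling the DCl infrared constants by reduced-mass ratios, then forming B₀ and B_e) can be sketched numerically for TCl³⁵. The helper names are illustrative, and the speed of light used for the cm⁻¹ to Mc/sec conversion is the modern value, which is an assumption that differs slightly from the constants available in 1955.

```python
C_MHZ_PER_CM = 29979.2458  # Mc/sec per cm^-1 (modern speed of light, assumed)

# Atomic masses in amu as listed in the footnote to Table II.
M_D, M_T, M_CL35 = 2.014741, 3.016997, 34.980064

def reduced_mass(m1, m2):
    """Reduced mass of a diatomic molecule."""
    return m1 * m2 / (m1 + m2)

def scale_dcl_to_tcl(d_dcl, alpha_dcl, mu_dcl, mu_tcl):
    """Scale D by the squared reduced-mass ratio and alpha by its 3/2 power."""
    ratio = mu_dcl / mu_tcl
    return d_dcl * ratio ** 2, alpha_dcl * ratio ** 1.5

def rotational_constants(nu0, d_j, alpha_e):
    """B0 = nu0/2 + 2*D and Be = B0 + alpha_e/2 for a J=0->1 line."""
    b0 = nu0 / 2.0 + 2.0 * d_j
    return b0, b0 + alpha_e / 2.0

mu_dcl = reduced_mass(M_D, M_CL35)
mu_tcl = reduced_mass(M_T, M_CL35)

# DCl(35) infrared values quoted in the text: D = 4.110 Mc/sec, alpha = 0.1123 cm^-1.
d_tcl, alpha_tcl = scale_dcl_to_tcl(4.110, 0.1123 * C_MHZ_PER_CM, mu_dcl, mu_tcl)
b0, be = rotational_constants(222143.78, d_tcl, alpha_tcl)

print(f"D = {d_tcl:.3f} Mc/sec, alpha_e = {alpha_tcl:.0f} Mc/sec")
print(f"B0 = {b0:.2f} Mc/sec, Be = {be:.1f} Mc/sec")
```

Under these assumptions the results land close to the TCl³⁵ column of Table II; small differences in the last digit are to be expected from the choice of physical constants.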

Perhaps the information of most interest obtained 
in the present work is the Cl nuclear quadrupole 
coupling in TCl (which will, of course, be essentially
that in HCl). Hydrogen chloride has intermediate 
ionic character whereas most simple diatomic molecules 
for which couplings have been observed have either 
very low ionic character or almost completely ionic 
character. Furthermore, the complication of possible 
double bond character is ruled out for bonds to hydrogen.

From Heitler-London theory, with various amounts of
hybridization assumed for the Cl orbital, Schatz⁶ has
predicted that the Cl³⁵ coupling in HCl may be of the
order of 110 Mc/sec. The measured value, 67 Mc/sec,
is more nearly in agreement with that, 61 Mc/sec,
predicted⁷ from a simple ionic character-electronegativity relation⁸ derived empirically from nuclear
quadrupole coupling data with all hybridization effects
neglected. The solid state value 53.3 Mc/sec obtained
from the pure quadrupole spectrum⁹ of HCl is 20
percent lower. A lower value is expected for the solid
because of the increased ionic character caused by the
dipole-dipole interaction of the closely spaced molecules
of the solid.

The unbalanced electron number, $U_p$, of Cl in
gaseous HCl can be obtained with the aid of the coupling
per unbalanced p electron, −110 Mc/sec, obtained
from atomic beam experiments.¹⁰ It is $U_p=67/110=0.61$. If hybridization and overlap charge effects
are neglected, the ionic character of the HCl bond is
determined by the $U_p$ as 39 percent. The ionic character
estimated from electronegativity is 45 percent if the
Pauling x values, 2.1 and 3.0, are employed for H and
Cl, or 42.5 percent if the values 2.13 and 2.98 derived
from force constants¹¹ are used.
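
The arithmetic in this paragraph can be reproduced in a few lines. The sketch assumes that the ionic character is taken as one minus the unbalanced p-electron number, and that the electronegativity estimate is half the electronegativity difference; the second relation is inferred here from the quoted numbers rather than stated explicitly in the text. Function names are illustrative.

```python
def unbalanced_p_electrons(eqq_molecule, eqq_per_p_electron):
    """Ratio of the molecular coupling to the coupling per unbalanced p electron."""
    return abs(eqq_molecule / eqq_per_p_electron)

def ionic_from_coupling(u_p):
    """Ionic character with hybridization and overlap neglected (assumed 1 - U_p)."""
    return 1.0 - u_p

def ionic_from_electronegativity(x_a, x_b):
    """Assumed relation: half the electronegativity difference."""
    return abs(x_a - x_b) / 2.0

u_p = unbalanced_p_electrons(67.0, -110.0)
print(round(u_p, 2), round(ionic_from_coupling(u_p), 2))
print(ionic_from_electronegativity(2.1, 3.0))
print(ionic_from_electronegativity(2.13, 2.98))
```

With the Pauling values this gives 45 percent and with the force-constant values 42.5 percent, to be compared with the 39 percent obtained from the coupling.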


TRITIUM BROMIDE 


Table III gives the calculated and observed lines of
the J=0→1 transition of TBr⁷⁹ and TBr⁸¹. In Table
IV, the rotational constants and other information are


TABLE III. J=0→1 transition of tritium bromide.

                          Frequency in Mc/sec
           F→F'        Observed            Calculatedᵃ

TBr⁷⁹      3/2→1/2     172 366.65±0.3      172 366.55
           3/2→5/2     172 472.43±0.3      172 472.55
           3/2→3/2     172 605.06±0.3      172 605.05

TBr⁸¹      3/2→1/2     172 232.46±0.3      172 232.53
           3/2→5/2     172 321.11±0.3      172 321.13
           3/2→3/2     172 430.85±1.0      172 431.83

ᵃ Calculated with the ν₀ and coupling constants listed in Table IV.


6 P. N. Schatz, J. Chem. Phys. 22, 695 (1954).
7 W. Gordy, J. Chem. Phys. 22, 1417 (1954).
8 W. Gordy, J. Chem. Phys. 19, 792 (1951).
9 Livingston, see reference 4.
10 V. Jaccarino and J. G. King, Phys. Rev. 83, 471 (1951).
11 W. Gordy, J. Chem. Phys. 14, 305 (1946).
TABLE IV. Characteristic constants of tritium bromide.ᵃ

              TBr⁷⁹                      TBr⁸¹
eQq(Br)       530±2 Mc/sec               443±2 Mc/sec
ν₀(0→1)       172 499.05±0.3 Mc/sec      172 343.23±0.4 Mc/sec
B₀            86 252.24 Mc/sec           86 174.33 Mc/sec
D             1.36 Mc/secᵇ               1.36 Mc/secᵇ
α_e           1390 Mc/secᵇ               1389 Mc/secᵇ
B_e           86 947.2 Mc/sec            86 868.8 Mc/sec
r₀            1.42012 A                  1.42011 A
r_e           1.41443 A                  1.41442 A
m_T/m_D       1.497471ᶜ                  1.497464ᶜ

ᵃ Values of D and α are calculated from the corresponding values for
HBr obtained by infrared spectroscopy by Thompson, Williams, and
Callomon, Spectrochim. Acta 5, 311 (1952).
ᵇ The B's employed for DBr were obtained from the B₀ of reference 2
and the α's calculated from HBr data of reference 12.
ᶜ Masses used for calculations: hydrogen, see Table II; Br⁷⁹ = 78.94438
amu; Br⁸¹ = 80.94228 amu.


TABLE V. Comparison of nuclear quadrupole coupling
in gaseous TBr and DBr.

              Coupling in Mc/sec
Nucleus       TBr          DBr
Br⁷⁹          530±2ᵃ       533±3ᵇ
Br⁸¹          443±2ᵃ       445±3ᵇ

ᵃ This work.
ᵇ W. Gordy and C. A. Burrus, Phys. Rev. 93, 419 (1954).


summarized. The auxiliary constants, $\alpha_e$ and $D_J$, were
calculated from infrared data as described for TCl.

In Table V, the nuclear quadrupole couplings are
compared with those of DBr previously obtained.
Through an error in transcription, the eQq for Br⁸¹ in
DBr was listed as 455 Mc/sec in the earlier work
rather than as 445 Mc/sec, the value which was obtained
from the spectrum. The TBr couplings are slightly
smaller for both Br⁷⁹ and Br⁸¹ than are those of DBr.
The differences are, however, within the experimental 
error and may not be significant. 

The earlier work on DBr combined with the present
measurement on TBr allows an evaluation of the mass
ratio of tritium and deuterium. The reduced mass
ratios are inversely proportional to the $B_e$ values,
from which the mass ratio,

$$m_T/m_D=1.49747,$$

is readily derived. The greatest error in this evaluation
arises from the $\alpha_e$ values obtained from infrared data.
Nevertheless, the good agreement with the ratio,

$$m_T/m_D=1.497462,$$

from other sources indicates that the infrared values of
$\alpha_e$ are very good. Two sources of $\alpha_e$ are available,
one from the HBr results of Thompson, Williams, and
Callomon¹² and the other from the DBr results of
Keller and Nielsen.¹³ Values from the two sources are


12 Thompson, Williams, and Callomon, Spectrochim. Acta 5,
311 (1952).
13 F. L. Keller and A. H. Nielsen, Phys. Rev. 91, 235 (1953).
in good agreement. The mass of tritium, so far as we
know, has not been measured directly, but it has been
calculated with high accuracy from nuclear reaction
energies. The $m_D=2.014741(\pm3)$ value which we used
here is given by Ogata and Matsuda,¹⁴ and the $m_T=3.016997(\pm11)$ is that given by Whaling, Fowler,
and Lauritsen.¹⁵ This value of $m_D$ with our mass ratio
leads to the spectroscopic value $m_T=3.01700$ for the
tritium mass.

14 K. Ogata and H. Matsuda, Phys. Rev. 89, 27 (1953).
15 Whaling, Fowler, and Lauritsen, Phys. Rev. 83, 512 (1951).

After our measurements on TBr were complete we 
received the Progress Report from the Columbia 
Radiation Laboratory dated October 30, 1954 which 
states that A. H. Nethercot and B. Rosenblum¹⁶ have
also made measurements on the J=0→1 transition
of TBr.


16 Note added in proof.—The work of Nethercot and Rosenblum
has now been published [Phys. Rev. 97, 84 (1955)].
Fermi Resonance in the Microwave Spectrum of Linear XYZ Molecules*


W. Low 
Department of Physics, The Hebrew University of Jerusalem, Jerusalem, Israel 
(Received November 15, 1954) 


Measurements of the rotation-interaction constants $\alpha_i$ for the various vibrational states in the microwave
spectrum of OCS and OCSe show little consistency. These discrepancies are explained within experimental
accuracy as being due to interaction between adjacent vibrational levels with quantum numbers $v_1, v_2, v_3$
and $v_1-1, v_2+2, v_3$. Resonance of this type, called Fermi resonance, has also been found in ICN but the
interaction constant is very much smaller.





THE rotational frequency for any vibrational state
of a linear XYZ molecule is given to a first
approximation by

$$\nu_{J-1\to J}=2JB_v-4D_v(J^{3}-l^{2}J),$$
$$B_v=B_e-\alpha_1(v_1+\tfrac{1}{2})-\alpha_2(v_2+1)-\alpha_3(v_3+\tfrac{1}{2}), \qquad (1)$$

where $B_e$ is the rotational constant, assuming the nuclei
to be in their equilibrium positions, and is inversely
proportional to the moment of inertia of the molecule;
$D_v$ is the centrifugal distortion coefficient in the vibrational state v; J is the quantum number of the total
angular momentum; $v_i$ the quantum number of the ith
vibrational mode; and q is the l-type doubling constant.

In this type of molecule, the centrifugal distortion is
usually small compared to the effects to be considered
and can be neglected. The second vibrational mode $\nu_2$,
the bending mode, is degenerate. Owing to vibration-rotation interaction this degeneracy is lifted and the level is
split into two levels, called an l-type doublet, which are
designated by $v_2^{\,l}{}_{1}$ and $v_2^{\,l}{}_{2}$, respectively. The l-type
splitting is appreciable only when l=1 and is given by
$\Delta\nu=2qJ$.¹ The l-type coupling constant varies slightly
with J by an amount of the order of $q(B/\omega)J(J+1)$.²

The rotation-vibration constants $\alpha_i$ depend in a
complex way on the potential function, on the moment


* This work was supported jointly by the U. S. Signal Corps 
and the Office of Naval Research. 

1 H. H. Nielsen, Phys. Rev. 75, 1961 (1949).

2 C. H. Townes and A. L. Schawlow, Microwave Spectroscopy
(McGraw-Hill Book Company, Inc., 1955), Chap. II has a 
detailed discussion on vibration-rotation interaction. 


of inertia, and on the vibrational frequencies $\omega$ of the
molecule. Experimentally, one can determine the values
of $\alpha$ by measuring the frequency difference between the
ground state and the excited vibrational states or
between adjacent vibrational states. Thus, for example,
the frequency difference between the ground state
(000) and the center of the l-type doublet excited
vibrational state (010) in the J=1→2 transition should
equal $4\alpha_2$.

It has been the hope of microwave spectroscopists 
that rather exact nuclear mass ratios could be deter- 
mined from the spectra of linear XYZ molecules with 
various isotopic substitutions by measuring the rota- 
tional frequencies and the values of the rotation- 
vibration constants to obtain the equilibrium moments 
of inertia of the various isotopic molecules. However, 
discrepancies have appeared in the measurement of the
α's from various vibrational states.³ These discrepancies
are illustrated in Table I for OCS. Frequency shifts as
high as 17 Mc/sec are found for some vibrational states
in the J=1→2 transition and similar shifts in the
J=2→3 transition. The (02⁰0) line, moreover, does not
coincide with the center of the unsplit doublet (02²0),
indicating that the perturbations depend not only on 
the energy of the vibrations but on their symmetry as 
well. Similar discrepancies have been observed for 
OCSe and smaller ones for ICN. 

These anomalies may be explained as perturbations 


3 These discrepancies were reported for OCS by R. G. Shulman 
and C. H. Townes, Phys. Rev. 75, 1318(A) (1949). 
between nearly degenerate vibrational states of the
same symmetry. Such perturbations between vibrational states were first recognized by Fermi⁴ in
the spectrum of CO₂ and are usually called Fermi
resonance. The repulsion between adjacent levels due
to Fermi resonance changes the effective B value and
destroys the expected regularity as predicted by Eq.
(1). Figure 1 illustrates the effects of resonance on the
rotational levels of OCS in the J=2→3 transition.⁵
The arrows indicate the effect of the perturbations.
By allowing for these perturbations consistent values
of the α's can be obtained.


THEORY AND RESULTS 
Effect of Fermi Resonance on Vibrational Levels 


Herzberg has given an elementary theory of perturbation between vibrational levels.⁶ This theory will be
adapted for the linear XYZ molecule and its effect on
the rotational spectrum. If there are n resonating
levels of the same symmetry the perturbed energy
levels W are given by first-order perturbation theory
by the solution of the nth order determinant:

$$\begin{vmatrix}
W_1^0-W & W_{12} & W_{13} & \cdots & W_{1n}\\
W_{21} & W_2^0-W & W_{23} & \cdots & W_{2n}\\
\vdots & \vdots & \vdots & & \vdots\\
W_{n1} & W_{n2} & W_{n3} & \cdots & W_n^0-W
\end{vmatrix}=0, \qquad (2)$$

where $W_n^0$ are the unperturbed energy levels;

$$W_{ni}=\int\psi_n^{0}\,W\,\psi_i^{0*}\,d\tau;$$

and $\psi_n^0$ is the zero approximation of the wave function for the
vibrational level n. The levels must be of the same symmetry
type, and therefore must be states of the same angular
momentum l about the molecular axis to have matrix
elements $W_{ni}$ different from zero. The perturbation
interaction is given by the anharmonic terms of the 
potential function, which, in the case of the linear 
XYZ molecule, can be written as 


$$V=k_{111}q_1^{3}+k_{113}q_1^{2}q_3+k_{122}q_1q_2^{2}+k_{133}q_1q_3^{2}+k_{223}q_2^{2}q_3+k_{333}q_3^{3}+\text{quartic terms}. \qquad (3)$$


The constants $k_{ijk}$ are force constants and the $q_i$ are
the normal coordinates. 

Examples of Fermi resonance between adjacent levels
of the same symmetry are shown in Fig. 2. Thus, for
example, the two levels (02⁰0) and (100) perturb each
other; similarly, the three levels (04⁰0), (12⁰0), and
(001).

If only two levels perturb each other, the difference 


4 E. Fermi, Z. Physik 71, 250 (1931).

5 Figure 1 is taken from reference 2 with the kind permission
of Professor Townes.

6 G. Herzberg, Infrared and Raman Spectra of Polyatomic
Molecules (D. Van Nostrand Company, Inc., New York, 1945).
FIG. 1. Rotational transition J=2→3 of OCS. The arrows
indicate the effect of Fermi resonance perturbation in shifting
the rotational frequencies. The vibrational state is given by
quantum numbers $(v_1,v_2,v_3)$ in brackets, $v_2$ having a superscript
|l|. In the case |l|=1, the subscript 1 refers to the lower-frequency
component and 2 to the higher-frequency component of the
doublet. Intensities are calculated for approximately 800°C.








in energy between the perturbed levels is given by
$$\Delta W=(4|W_{12}|^{2}+\delta^{2})^{1/2}, \qquad (4)$$

where $\delta=W_1^0-W_2^0$ is the separation between the
unperturbed levels. In the special case of interest when
the two levels with quantum numbers $(v_1, v_2^{|l|}, v_3)$ and
$(v_1-1, (v_2+2)^{|l|}, v_3)$ resonate, the interaction potential
W can be evaluated:

W ow, vel tl, oa3 1 —1, v2-+2! 4, 93) 
=n 

= ———_—Ah00'[ (v2 +2)?— PF} (5) 

16V2 2 clus hws 


where $\omega_1$, $\omega_2$, $\omega_3$ are the fundamental frequencies of the
XYZ molecule. It is noteworthy that the centers of
gravity of the perturbed and unperturbed levels
coincide.

The eigenfunctions of the new states are a mixture 
of the unperturbed eigenfunctions and are given by 


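$$\psi_1=a\psi_1^0-b\psi_2^0,\qquad \psi_2=b\psi_1^0+a\psi_2^0,$$
where
$$a=\left[\frac{(4|W_{12}|^{2}+\delta^{2})^{1/2}+\delta}{2(4|W_{12}|^{2}+\delta^{2})^{1/2}}\right]^{1/2},\qquad
b=\left[\frac{(4|W_{12}|^{2}+\delta^{2})^{1/2}-\delta}{2(4|W_{12}|^{2}+\delta^{2})^{1/2}}\right]^{1/2}, \qquad (6)$$
and
$$a^{2}+b^{2}=1.$$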


In the case of resonance, the B values for the various 
vibrational states as determined from Eq. (1) refer 
now to the perturbed value. The unperturbed B values 
can be found as follows. Let $B_1, B_2, \cdots B_n$ refer to the
measured rotation levels which resonate, and $B_1^0, B_2^0, \cdots B_n^0$
the unperturbed values in absence of resonance;
then

$$B_i=a_i^{2}B_1^0+b_i^{2}B_2^0+c_i^{2}B_3^0+\cdots,$$
$$a_i^{2}+b_i^{2}+c_i^{2}+\cdots=1, \qquad (7)$$
$$\sum_i B_i=\sum_i B_i^0,$$
FIG. 2. Vibrational states in OCS. Arrows indicate possible
interaction between vibrational levels.





where the summation is carried out over all levels. In
the case of interest of the two resonating levels $(v_1,v_2,v_3)$
and $(v_1-1, v_2+2, v_3)$, a knowledge of the two experimental values of $B_{(v_1,v_2,v_3)}$ and $B_{(v_1-1,v_2+2,v_3)}$ and the
unperturbed value of $B^0_{(010)}$ is necessary to determine
the unperturbed values of $\alpha_1$ and $\alpha_2$. From Eqs. (4)
and (7) and the separation $\Delta W$ between the vibrational
levels (known from infrared data) the unperturbed
separation $\delta$ and the interaction constant $W_{12}$ can be
evaluated.


OCS


Figure 2 shows the energy levels of the various 
vibrations and the arrows indicate the possible inter- 
actions which might cause Fermi resonance. Table I 
lists the observed frequencies for J=1→2 and J=2→3
transitions (interacting levels are bracketed). The
unperturbed value of $\alpha_2$ is found from the separation
between the rotational levels of states (000) and (01¹0).
The frequency shift of the (02⁰0) level due to Fermi
resonance may be obtained from the known value of $\alpha_2$.
The unperturbed value of $\alpha_1$ is now determined since
the frequency change of the (100) state must be equal
and opposite to that of (02⁰0). From Eq. (7), one finds
$a^{2}=0.9434$ and $b^{2}=0.0566$. The perturbed separation
between the vibrational levels has been measured by
Bartunek and Barker.⁷ (The l=0 and l=2 transitions
are interchanged in their paper.) Using their results,
one finds for the interaction energy $W_{12}=43.2$ cm⁻¹
and the unperturbed separation $\delta=166.6$ cm⁻¹. The
more recent measurements by Callomon et al.⁸ give the
separation $\Delta W$ as 188.8 cm⁻¹, and one finds $\delta=167.4$
cm⁻¹ and $W_{12}=43.6$ cm⁻¹. The correction factors due
to Fermi resonance are now calculated using the latter

7 P. F. Bartunek and E. F. Barker, Phys. Rev. 48, 518 (1935).
8 Callomon, McKean, and Thompson, Proc. Roy. Soc. (London)
A208, 341 (1951).
values and Eq. (5), and reversing the process. They
are listed in column 4. Some small discrepancies remain,
possibly because Eq. (1) is a first-order approximation
and, therefore, varies slightly with the vibrational
state. Slightly improved values are obtained using
$\delta=171.6$ cm⁻¹ and $W_{12}=45.8$ cm⁻¹. They are given in parentheses
in column 4.
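
The two-level relations of Eqs. (4) and (6) can be inverted to recover the unperturbed separation and the interaction constant from a measured separation and a mixing coefficient. The following is a minimal sketch of that inversion; the function names are illustrative, and it uses only the two-level formulas, not the full procedure of the paper.

```python
import math

def two_level_fermi(delta_w, a_sq):
    """Given the perturbed separation delta_w and the mixing coefficient a**2,
    return (delta, w12) from a**2 = 1/2 + delta/(2*delta_w) and
    delta_w**2 = 4*w12**2 + delta**2."""
    delta = (2.0 * a_sq - 1.0) * delta_w
    w12 = 0.5 * math.sqrt(delta_w ** 2 - delta ** 2)
    return delta, w12

def mixing(delta, w12):
    """Forward direction: return (delta_w, a**2, b**2) for two resonating levels."""
    delta_w = math.sqrt(4.0 * w12 ** 2 + delta ** 2)
    a_sq = 0.5 + delta / (2.0 * delta_w)
    return delta_w, a_sq, 1.0 - a_sq

# OCS: separation 188.8 cm^-1 and a**2 = 0.9434 as quoted in the text.
delta_ocs, w12_ocs = two_level_fermi(188.8, 0.9434)
print(f"OCS: delta = {delta_ocs:.1f} cm^-1, W12 = {w12_ocs:.1f} cm^-1")

# OCSe: delta = 275 cm^-1 and W12 = 45 cm^-1 as quoted in the text.
print(mixing(275.0, 45.0))
```

With the OCS inputs the sketch gives values near the quoted 167.4 and 43.6 cm⁻¹, and the OCSe inputs give a mixing coefficient near the quoted 0.975.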


OCSe⁸⁰


In OCSe⁸⁰, a shift of 4.22 Mc/sec due to Fermi
resonance was observed as indicated in Table II. Using
vibrational frequencies⁹ of OCSe, $\omega_1=642\pm4$, $\omega_2=466\pm4$,
$\omega_3=2022\pm4$ (all in units of cm⁻¹), one obtains
the unperturbed values $\alpha_1=14.01$ Mc/sec, $\alpha_2=-6.88$
Mc/sec, $a^{2}=0.975$, $b^{2}=0.025$, $W_{12}=45$ cm⁻¹ and $\delta=275$
cm⁻¹.


ICN 


The spectrum of ICN is complicated by quadrupole 
interaction. The excited states $v_2=2$, l=2, and $v_2=2$,
l=0 are separated by quadrupole effects. After correcting for second-order quadrupole

TABLE I. Fermi resonance perturbations in the
rotational spectrum of OCS.

Rotational    Vibrational state    Frequency     Correction due to Fermi        Intensity
transition    v₁ v₂^|l| v₃         Mc/sec        resonance perturbation         cm⁻¹
                                                 Mc/secᵃ

J=1→2         00⁰0                 24325.92ᵇ     0                              5.5×10⁻⁵
              01¹₁0                24355.50ᵇ     0                              4.4×10⁻⁶
              01¹₂0                24381.07                                     4.4×10⁻⁶
              10⁰0                 24253.51ᶜ     −9.52 (−9.85)                  8.7×10⁻⁷
              02⁰0                 24401.0ᶜ      +9.52 (+9.85)                  3.2×10⁻⁷
              20⁰0                 24179.62      −16.4 (−17.5)                  1.3×10⁻⁸
              12⁰0                               +16.4 (+17.5)
              11¹₁0                24289.97      −16.2 (−17.15)                 6.2×10⁻⁸
              11¹₂0                24316.76                                     6.2×10⁻⁸
              03¹₁0                24411.4                                      2.8×10⁻⁸
              03¹₂0                24459         +16.2 (+17.15)                 2.8×10⁻⁸

J=2→3         00⁰0                 36488.82      0                              1.5×10⁻⁴
              01¹₁0                36532.47      0                              1.3×10⁻⁵
              01¹₂0                36570.83                                     1.3×10⁻⁵
              02²0                 36615.3       0                              2.2×10⁻⁶
              02⁰0                 36600.81      −14.28 (−14.83)                1.1×10⁻⁶
              10⁰0                               +14.28 (+14.83)

ᵃ Calculated with the values δ=167.4 cm⁻¹, W₁₂=43.6 cm⁻¹. Values in
parentheses were calculated with δ=171.6 cm⁻¹ and W₁₂=45.8 cm⁻¹.
Interacting levels are bracketed.
ᵇ Townes, Holden, and Merritt, Phys. Rev. 74, 1113 (1948).
ᶜ Bianco and Roberts (private communication).
ᵈ R. G. Shulman, Thesis, Columbia University, 1949 (unpublished).
ᵉ Strandberg, Wentink, and Kyhl, Phys. Rev. 75, 270 (1949).

9 We are grateful to Professor R. F. Lord for informing us of
the unpublished vibration frequencies of OCSe.
effects, a small discrepancy of 0.35 Mc/sec is found
between these lines. This yields a value of $W_{12}=3$ cm⁻¹,
$\delta=171$ cm⁻¹, $\alpha_1=9.33$ Mc/sec, and $\alpha_2=-6.88$ Mc/sec.

Tetenbaum¹⁰ᵃ has made a detailed analysis of the
spectrum of BrCN. Unlike our value of the interaction
constant $W_{12}$ for ICN he finds a much larger value of
61.5 cm⁻¹ for BrCN. This large difference is somewhat
puzzling. It is intended to make measurements on the
higher vibrational states of ClCN and to compare them
with these two molecules.


TABLE IT. Fermi resonance perturbations in the rotational 
spectrum of OCSe® and ICN. 








Correction due 





Vibrational to Fermi res. 
state Frequency _ perturbation Intensity 
Molecule v1 valtl ys Mc/sec Mc/sec em} 
OCSe® 00 0 24 105.85* 0 6.8X10-* 
J=2-3 — 
0140 24 137.80 0 7.2X1077 
0120 24 156.46 0 7.2X1077 
( 0 0 24 026.26» —4.22 3.11077 
0 2° 0 24 183.97 +4.22 8.1 10-8 
022 0 24 188.18 0 1.6X1077 
ICN 00 0 25 823.08° 0 7.51075 
J=3-4 
E;=11/2- 10 0 25 748.18¢° —0.35 70X%10"* 
F,=13/2 
02° 0 25 979.72 +0.35 1.7X10-* 
0 2? 0 26 046.32 0 3.5X107* 








a Geschwind, Minden, and Townes, Phys. Rev. 78, 174 (1950).

b Strandberg, Wentink, and Hill, Phys. Rev. 75, 827 (1949). Frequency
corrected by 0.13 Mc/sec so that the ground state should coincide with
that in reference a.

c Townes, Holden, and Merritt, Phys. Rev. 74, 1113 (1948).


10 J. Bardeen and C. H. Townes, Phys. Rev. 73, 627 (1948). 
10a S. J. Tetenbaum, Phys. Rev. 86, 440 (1952). 
TABLE III. Summary of results.











           Vibrational frequencies     Interaction     Rotation-vibration
                   (cm⁻¹)               constant       interaction (Mc/sec)
Molecule    ν1        ν2       ν3        (cm⁻¹)        α1       α2       α3
CO2ᵃ      1342.8    667.3    2349.3      51.2          13.2    −21.6    92.1
OCS        865       524     2066ᵇ     43.6 (45.8)     20.5    −10.59
OCSe       642       466     2022ᶜ       45            14.0     −6.88








a Taylor, Benedict, and Strong, J. Chem. Phys. 20, 1884 (1952); Benedict,
Herman, and Silverman, J. Chem. Phys. 19, 1325 (1951).

b See reference 8.

c Professor F. C. Lord (private communication).


Table III summarizes these results. It is interesting 
to note the similarity in values of W12 for CO2, OCS,
and OCSe. 

Because of these types of perturbations in polyatomic 
molecules it will be very difficult to determine the 
equilibrium moments of inertia for several isotopic 
species with sufficient accuracy to take full advantage 
of the accuracy of microwave measurements for nuclear 
mass determinations. In some cases, allowance can be 
made for Fermi resonance. But in others, as in the 
determination of the third vibrational mode, it may be 
very difficult since the level may be perturbed by 
several adjacent levels as seen from Fig. 2. On the 
other hand, isotopic mass ratios can be determined 
with good accuracy from the effective rotational con- 
stants of polyatomic molecules if two isotopic masses 
are known.¹¹

This work was carried out at the Department of 
Physics, Columbia University, in 1949, and was re- 
ported at the American Physical Society meeting at 
Washington in 1950.¹² The author would like to thank
Professor C. H. Townes for the constant advice and 
help which made this work possible. 


11 Townes, Holden, and Merritt, Phys. Rev. 74, 1113 (1948).
12 W. Low and C. H. Townes, Phys. Rev. 79, 224(A) (1950). 
Ionization in Pure Gases and the Average Energy to Make an Ion Pair
for Alpha and Beta Particles 


WILLIAM P. JESSE AND JOHN SADAUSKIS
Argonne National Laboratory, Lemont, Illinois 
(Received November 26, 1954) 


A series of measurements has been made of the relative currents produced in different gases by beta
particles from Ni⁶³ and from tritium sources in an ionization chamber. In all cases only relative current
measurements with argon as a standard gas have as yet been made. The value of W, the average energy to
make an ion pair computed relative to argon as a standard, is found to be the same for Ni⁶³ and tritium
sources. If these relative Wβ values are plotted as abscissas against previously determined Wα values for
polonium alpha particles as ordinates, a marked difference is observed in the gases investigated. For hydrogen
and the noble gases the plotted points lie closely on a 45 degree straight line through the origin. Thus for
these gases the ratio Wα/Wβ is constant. This constant may well be unity, but this is not proved as yet by
these results. For all other gases so far investigated, the plotted points lie above the 45 degree line, indicating
a higher efficiency of ionization (and a lower W) for the beta particles than for the polonium alpha particles.
These results are in accord with the findings of Gray and Gurney. Gurney's results have been extended here
to include a greater variety of gases for reduced alpha particles of approximately 1-Mev energy. Two
postulates are advanced to explain the behavior of Wα and Wβ here found.





EXPERIMENTS have been in progress for some
time in which the relative ionization for different
gases has been measured for beta particles emitted from
tritium and Ni⁶³ sources. Although the results are as
yet preliminary, a comparison of them with the corresponding
alpha-particle values is of interest and would
seem to warrant publication at this time.


EXPERIMENTAL METHODS AND RESULTS 


The ionization chamber used was a cylindrical one of 
brass of inside diameter 9.5 cm and height 7 cm (Fig. 1). 
The collecting electrode at the center of the cylinder 
was in the form of a wire ring 5 cm in diameter, sup- 
ported by three wire stays rising from the central 
insulated shaft. In the plane of the collecting ring and 
filling the interior was a square gridwork of copper 
wires spaced 6 mm apart in each direction. In some 
experiments the individual wires were 2 mils in diameter 
and in others 6 mils. The wires in the central area of the 
collecting disk were coated with the beta-emitting substance
under investigation. For Ni⁶³ this coating was
accomplished by electroplating. The tritium, in the form
of a solution of tritiated polystyrene, was applied with
a brush to the gridwork and was made conducting when
dry by a light application of graphite (soft lead pencil).

FIG. 1. Schematic diagram of ionization chamber for
beta-particle current measurements.

The employment of the gridded electrode came as the 
result of a series of experiments where the source was 
applied to the extended surface of a plate electrode. 
The expected ionization seemed to be reduced by what 
we believe to be a process of backscattering of the very 
soft beta particles from the molecules of the gas back 
to the plate. The higher the atomic number of the gas 
the greater this effect seemed to be. With the gridded 
electrode the effective area for interception of such 
back-scattered particles is markedly reduced, and the 
effect is essentially eliminated. 

With each of the beta sources in the chamber a series 
of measurements was made with the gases under investi- 
gation. The general arrangement of the apparatus was 
much the same as in previous experiments,¹ where the
ionization chamber was coupled to a vibrating-reed 
electrometer, which in turn fed into a Brown strip- 
chart recorder. In the present experiments, however, 
the ionization current for each gas was measured by 
means of a null method. Here a counter-potential 
opposing the natural drift was continuously built up 
upon the floating vibrating-reed system from an ex- 
ternal potentiometer, and the time between successive 
passages of the Brown recorder needle through a 
fiducial mark was determined. The ratios of the currents
for the different gases were determined with argon as an
arbitrarily chosen standard. The ratio of the values 
of W, the average energy to make an ion pair, is, of 
course, the reciprocal of this current ratio. 

It should be noted that the current measured is an 
integrated effect from a large number of beta particles 
of varying energy. Even the average energy of the 
beta-particle beam is not exactly known, when the 
effects of self-absorption in the sample are considered. 
A rough estimate of the average energy would seem to 


1 Jesse, Forstat, and Sadauskis, Phys. Rev. 77, 782 (1950).


TABLE I. Comparison of W values in ev/ion pair 
for alpha and beta particles. 











Columns: (1) gas; (2) Wβ, Ni⁶³, 2-mil grid-wire; (3) Wβ, Ni⁶³, 6-mil grid-wire; (4) Wβ, tritium, 6-mil grid-wire; (5) Wβ, tritium and A³⁷ (Valentine)ᵃ; (6) Wα for Po αᵇ; (7) Wα for reduced α.

He 42.3 42.7 42.4
Ne 36.6 36.8 37.4
A (26.4) (26.4) (26.4) (26.4) 26.4 (26.4)
Kr 24.1 24.1 24.2 24.1 24.1
Xe 21.8 22.2 21.9
H2 36.3 37.2 36.3
Air 34.3 34.0 33.9 34.1 35.5 37.1
N2 34.7 34.8 36.6 38.1
O2 30.9 31.4 32.5
CO2 33.2 32.9 32.8 34.5 36.3
C2H4 26.0 26.5 28.0 29.8
C2H6 24.9 24.6 26.6 28.5
CH4 27.3 29.3 29.2 31.0
C2H2 26.1 27.5 29.0








a J. M. Valentine, Proc. Roy. Soc. (London) A211, 75 (1952). 
b W. P. Jesse and J. Sadauskis, Phys. Rev. 90, 1120 (1953).


be from 3 to 5 kev for the tritium sample and from 15
to 20 kev for the Ni⁶³ sample.

Throughout the experiment extreme precautions were 
taken to insure the purity of the gases used, since, 
especially in the noble gases, the presence of minute 
impurities has been shown to affect greatly the measured 
ionization. Hence, the noble gas measurements were 
made with the gas in continuous circulation through an 
appropriate purification system. 

In Table I a comparison is made of the average 
energy to make an ion pair for beta particles and for 
alpha particles in the various gases used. With the 
exception of the results in column 6, which give absolute 
values of W for polonium alpha particles already determined
in this laboratory,³ all the W values in Table I
are computed relative to argon as a standard gas. For 
convenience, in each vertical column the argon W value 
is arbitrarily assumed to be 26.4 ev/ion pair, the 
absolute value of W determined for polonium alpha 
particles. The W values for all other gases in the column 
are computed on this basis from the ratios measured 
relative to argon. 
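The normalization described above can be sketched in a few lines. This is only an illustration: it assumes, as the text does, that the same source is fully absorbed in each gas, so that W is inversely proportional to the measured current. The function name and the sample current ratio are hypothetical and are not measured data from this work.

```python
W_ARGON_EV = 26.4  # ev/ion pair, the argon reference value adopted in Table I


def relative_w(current_gas, current_argon, w_argon=W_ARGON_EV):
    """W for a gas from ionization currents measured relative to argon.

    W is the reciprocal of the current ratio, scaled to the argon value:
        W_gas = W_argon * (I_argon / I_gas)
    """
    if current_gas <= 0 or current_argon <= 0:
        raise ValueError("currents must be positive")
    return w_argon * current_argon / current_gas


if __name__ == "__main__":
    # Hypothetical example: a gas giving 80 percent of the argon current.
    print(f"W = {relative_w(0.80, 1.00):.1f} ev/ion pair")
```

Because every column is scaled to the same argon value, only ratios to argon are meaningful within a column.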

Three series of beta-ionization measurements are 
shown in vertical columns 2, 3, and 4 of Table I. Two of 
these are for Ni⁶³ plated on 6-mil and 2-mil copper
wires, respectively, and the third is for tritiated poly- 
styrene on 6-mil copper wires. No important differences 
exceeding the experimental error can be observed in the 
results for the three series. In particular, no change in W 
can be observed between the results from the disintegration
of Ni⁶³ of estimated average beta energy 15
to 20 kev and tritium of energy 3 to 5 kev. 

In vertical column 5 are given for comparison the
results of Valentine,⁴ who used as active sources tritium
and A³⁷. These results are again relative to argon as a
standard gas. With the exception of the last value for
CH4, the agreement with the present results is satisfactory.

2 W. P. Jesse and J. Sadauskis, Phys. Rev. 88, 417 (1952).
3 See reference b of Table I.
4 See reference a of Table I.
In the graph of Fig. 2 a comparison is made of the
average energy to make an ion pair for beta particles
and for alpha particles in the various gases used. As
ordinates are plotted the absolute W values for Po
alpha particles taken from column 6, Table I. As abscissas
are plotted the mean of the Wβ values of columns
2, 3, and 4. The plotted points on the graph are designated
by large circles.

It will be noted from Fig. 2 that the gases investigated 
seem to fall into two groups if classified according to 
their W values. In the first group, comprising the noble 
gases and H2, the plotted points lie very closely upon
a 45 degree line drawn from the origin through the 
somewhat arbitrarily chosen point for argon. This 
would indicate that for all these gases 


Wα/Wβ = constant.


This constant may well be unity, but so far this has 
not been directly proved in these results, since up to 
this time no absolute beta-particle measurements have 
been obtained. 

In the other group of gases, comprising air, O2, N2,
CO2, and the hydrocarbon gases, the experimental
points lie definitely above the 45 degree line. This would
indicate that, relative to argon, the Wβ values for these
gases are lower than the corresponding Wα values. In
other words, in these gases the beta particles ionize
more efficiently than do the polonium alpha particles.

In the case of air, the lower W for beta particles
might be partly explained by better voltage saturation
in the chamber, since the difficulties of drawing ions
out of the dense alpha-particle tracks are well known.
It is doubtful, however, whether this explanation could
apply to the whole group of gases including N2 and the
hydrocarbons, where the saturation difficulties are
much less severe.

FIG. 3. Schematic diagram of ionization chamber for measurement
of reduced alpha particles, showing collimating system and
absorbing window.

It is of interest to note that the group of gases for
which Wα/Wβ is here constant, that is, the noble gases
and H2, is just the group of gases for which the relative
Wα values were found by Gray⁵ to be independent of
alpha-particle energy. On the other hand, for air, N2,
and O2 a variation of Wα with alpha-particle energy
was found, Wα being higher for lower energy ranges.
These gases would seem to correspond to the second
group above.

Since the experimental work of Gurney,⁶ on which the
conclusions of Gray are based, did not include many of
the gases investigated above, especially CO2 and the
hydrocarbons, it was decided to repeat the experiment
of Gurney.

Alpha particles from an Am²⁴¹ source deposited on a
platinum disk were collimated and allowed to pass from 
an evacuated region through a mica window into the 
ionization chamber (Fig. 3). The mica was chosen of 
such a thickness that the maximum energy of the 
emerging alpha particles was of the order of 1.2 Mev. 
The initial alpha-particle beam was not, however, 
strictly monoenergetic, since because of its divergence 
particles passing through the mica sheet at other than 
normal incidence suffered a greater loss of energy. In 
these experiments with alpha particles of reduced 
energy, as in the case of beta particles, all measure- 
ments were made relative to argon as a standard. 

The data from these measurements are shown in the 
last vertical column of Table I. In Fig. 2, the W values 
for these reduced alpha particles are plotted as ordinates 
against the corresponding beta values. The plotted 
points are indicated by heavy crosses. 

Here again the points for the noble gases fall reason- 
ably well along the 45 degree line, while those for air, 


5 L. H. Gray, Proc. Cambridge Phil. Soc. 40, 95 (1944). 
6 R. W. Gurney, Proc. Roy. Soc. (London) A107, 332 (1925). 
N2, CO2, and the hydrocarbons lie even further above
the line than do those for the polonium alpha particles.
This shows a relative increase in the Wα values with
decreasing energy in accord with the findings of Gray.
In the present work CO2 and the hydrocarbons are
shown to lie in the general class with air, O2, and N2
already cited by Gray. Parenthetically, it may be
noted that the ratio Wα/Wβ is slightly larger for the
hydrocarbons as a class than for air, N2, O2, and CO2,
both for the polonium and reduced alpha particles.


CONCLUSIONS 


It would seem that the data here presented fall into 
accord on the basis of the two following postulates. 
Neither of them is as yet proved conclusively, but there 
is much experimental evidence to support them in 
addition to that cited here. 

1. If, in the relation Wα/Wβ = constant for hydrogen
and the noble gases, we assume this constant to be
unity, then W in these gases is the same both for beta
and alpha particles throughout all ranges of energy for
either so far measured by us.

2. In all other gases so far measured, Wα/Wβ is not
found to be constant. The variation in the ratio seems
to come from a variation of Wα with alpha-particle
energy rather than a variation of Wβ with beta-particle
energy. As the alpha-particle energy increases, this
ratio probably approaches a constant value, which
again is probably unity.

The following experiments would seem desirable to 
prove or disprove the postulates above. 

1. To determine the relative W values for gases of
Class 2 for alpha particles over a small energy increment
in the neighborhood of 5 Mev or higher. These values
should approach the Wβ values if postulate 2 is correct.

2. So far only beta particles of very low energies have
been investigated and these over a very small range of
average energy. Plans are in progress to repeat the
measurements with beta particles from C¹⁴.*

3. In the relation Wα/Wβ = constant, one cannot be
certain that the constant above is unity without an
absolute determination of Wβ. This is not easy, but
already preparations are being made for such a determination.

It is a pleasure to express our thanks to a large 
number of friends and colleagues for the many stimu- 
lating discussions of this subject. Among these are 
Professor Robert L. Platzman, Dr. Francis R. Shonka, 
Dr. John E. Rose, and Dr. L. D. Marinelli. Our thanks 
are due also Mr. L. C. Ellsworth, whose painstaking 
skill in the construction of our ion chambers has added 
greatly to the success of this work. 

* Note added in proof. Recent similar experiments with beta
particles from C¹⁴ indicate the same constancy of W with energy
found above for Ni⁶³ and tritium.
Cross sections are calculated for rotational excitation of a homonuclear diatomic molecule by collisions
with very slow electrons. The mechanism is assumed to be the long-range quadrupole interaction. The
Born approximation is shown to be correct in the low-energy limit. The results are applied to calculation
of energy losses in N2, and comparison is made with values inferred from swarm and cross-modulation
experiments. At energies below 0.29 ev, the threshold for vibrational excitation, losses are ~ twice the
experimental values, but many times larger than the value (2m/M) for elastic losses only.





I. INTRODUCTION 


AN electron making elastic collisions only as it
moves through a gas is expected to lose energy
at a rate ~$(2m/M)\epsilon_a$ per collision,¹ where m is the
electron mass, M the molecular mass, and the electron
energy $\epsilon_a$ is large compared to the mean thermal energy
of the molecules. Measurements by swarm experiments²
of the rate of energy loss in the noble gases He, Ne, A
agree with this theoretical expectation at average electron
energies well below the first excitation potential.
In molecular gases, on the other hand, the reported
energy losses²,³ per collision considerably exceed
$(2m/M)\epsilon_a$, at average electron energies far below the
electronic excitation threshold. In N2 in particular this
excessive energy loss is confirmed by recent experiments
in the laboratory³,⁴ and in the ionosphere.⁵
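To give a sense of scale for the elastic loss factor 2m/M quoted above, the following sketch evaluates it for the gases named in the text. The mass values are standard rounded atomic masses supplied here as assumptions; they do not come from the paper, and the names are hypothetical.

```python
# Electron mass in atomic mass units (standard value, rounded).
ELECTRON_MASS_U = 5.4858e-4

# Approximate molecular masses in atomic mass units (assumed inputs).
MASS_U = {"He": 4.0026, "Ne": 20.180, "A": 39.948, "N2": 28.014}


def elastic_loss_fraction(mass_u):
    """Mean fractional energy loss per elastic collision, 2m/M."""
    return 2.0 * ELECTRON_MASS_U / mass_u


if __name__ == "__main__":
    for gas, mass in MASS_U.items():
        print(f"{gas:>3}: 2m/M = {elastic_loss_fraction(mass):.2e}")
```

For N2 the factor is close to 4×10⁻⁵, which is the elastic baseline against which the reported excess losses are judged.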

These results obviously suggest that in molecular
gases electrons too slow to cause electronic jumps
mainly lose energy by rotational and vibrational excitation.
Massey⁶ has calculated the cross section for
rotational excitation in molecules such as HCl which
possess permanent electric dipole moments; he finds
the cross section is quite large. In homonuclear diatomic
molecules, however, which have no electric dipole
moments, Morse⁷ estimated the energy loss by rotational
excitation to be of the order of the elastic loss.
If this estimate is correct, it is difficult to account for
the reported losses at average electron energies much
less than the vibrational excitation threshold, under
which circumstance vibrational excitation is presumably

* This research was supported in part by the United States
Air Force under a contract monitored by the Office of Scientific 
Research, Air Research and Development Command. Part of a 
thesis submitted by one of us (S. S.) in partial fulfillment of the 
requirements for the degree of Doctor of Philosophy. 
National Science Foundation Predoctoral Fellow. 

1 H. S. W. Massey and E. H. S. Burhop, Electronic and Ionic
Impact Phenomena (Clarendon Press, Oxford, 1952), pp. 15 and

2 R. H. Healey and J. W. Reed, The Behaviour of Slow Electrons
in Gases (Wireless Press, Sydney, 1941), pp. 87-102.

3 R. W. Crompton and D. J. Sutton, Proc. Roy. Soc. (London)
A215, 467 (1952).

4 L. Goldstein (private communication).

5 L. G. H. Huxley, Proceedings of the Conference on Ionospheric
Physics (July, 1950) Part A (Geophysics Research Division,
Air Force Cambridge Research Center, 1952), p. 149.

6 H. S. W. Massey, Proc. Cambridge Phil. Soc. 28, 99 (1932).

7 P. M. Morse, Phys. Rev. 90, 51 (1953).


negligible. In view of the experimental complications,⁸
and the numerous uncertainties in the interpretation
of the swarm experiments,⁹ it cannot be inferred that
Morse's estimate of the rotational excitation loss is
incorrect. Nonetheless the situation is not satisfactory,
and it appears desirable to re-examine the theoretical
probability of rotational excitation in homonuclear
molecules.

The result that the cross section for rotational
excitation by slow electrons is small may be understood
as follows. To conserve total angular momentum when
the molecule makes a rotational transition, the electron
must have some orbital angular momentum either
before or after the collision, i.e., it cannot both go in
and come out as an s-electron. But at these large
wavelengths only s-electrons have an appreciable probability
of being found in the vicinity of the molecule.
In other words, the electrons possessing the angular
momentum to cause rotational transitions are necessarily
far from the molecule, interact only weakly with
it, and the cross section for rotational excitation is
small. This argument, though appealing, proves to be
specious⁶ for molecules possessing a dipole moment
because the electron-dipole interaction potential, falling
off as $r^{-2}$, is sufficiently strong at long range to permit
appreciable interaction with electrons of $l>0$.

Homonuclear diatomic molecules generally have electric
quadrupole moments, so that their interaction with
electrons also has a long-range tail, in this case falling
off as $r^{-3}$. In the following section we use a multipole
expansion of the molecular field to calculate the cross
sections for rotational excitation and de-excitation of
homonuclear diatomic molecules by slow electrons.
The cross sections, proportional to the square of the
quadrupole moment, are smaller than in polar molecules.
Nonetheless, in N2, at electron energies well
below the vibrational threshold, the predicted energy
loss by rotational excitation is much larger than the
elastic loss and in fact is of the order of magnitude of
the observed loss.³,⁴


8 Crompton, Huxley, and Sutton, Proc. Roy. Soc. (London) 
A218, 507 (1953); L. G. H. Huxley and A. A. Zaazou, Proc. 
Roy. Soc. (London) A196, 402 (1949). 

9 W. P. Allis and H. W. Allen, Phys. Rev. 52, 703 (1937).
In our calculation, as in previous calculations⁶,⁷,¹⁰,¹¹
of rotational and vibrational excitation probabilities,
Born approximation has been employed despite the
fact that the electrons are very slow. Our use of this
approximation is defended in the final section of this
paper. We argue that the principal contribution to the
cross section comes from large distances of the incoming
electrons from the molecule, where the wave function
is only slightly distorted from its incident form. This
argument, which is related to one given by Massey,⁶
leads to the inference that at low energies, because the
effective region of interaction is increasingly distant
from the molecule, the Born approximation probably
improves with decreasing incident electron energy.

We support and justify our argument by evaluating 
the second Born approximation to the scattering ampli- 
tude for a pure quadrupole interaction. We find that 
the ratio of the second to first Born amplitudes ap- 
proaches unity with decreasing incident electron energy. 
Born approximation is much harder to justify when 
the principal contributions to the cross section come 
from small distances, as is the case when the quadrupole 
field of the homonuclear molecule is neglected. The 
oft-employed assumption that the charge distribution 
of the molecule is the sum of two spherically symmetric 
parts, each centered about a nucleus, neglects the 
quadrupole field, since the potential of such a charge 
distribution vanishes exponentially at infinity. These 
remarks amount to a criticism of the use of Born 
approximation in some previous work, and help account 
for the fact that we predict a larger energy loss by 
rotational excitation than does Morse.⁷

It is our conclusion that in N2 at least, at electron
energies below the vibrational threshold, losses significantly
exceeding the elastic value are consistent with
theory. Our predicted losses are in qualitative agreement
with the reported values; as explained in Secs.
III and IV, without additional experimental and theoretical
work, a more detailed comparison of our predictions
with the experiments would not be meaningful.
Evidently it would be desirable to have more direct
experimental evidence of rotational excitation. A possible
means of accomplishing this is described in an
accompanying paper¹² on H2.


II. FORMULATION

We seek a solution $\Psi$ of the Schrödinger equation:

$(H - E)\Psi = 0,$  (1)

where

$H = H_0 - (\hbar^2/2m)\Delta_{\mathbf{r}} + V,$  (2)


10H. S. W. Massey, Trans. Faraday Soc. 31, 556 (1935). 

11 T. Y. Wu, Phys. Rev. 71, 111 (1947).

12 E. Gerjuoy and S. Stein (to be published). In H2, the only
other gas on which there are recent data below the vibrational
threshold, the observed losses exceed the expected elastic loss by
a — —— factor than in Ne. See Crompton and Sutton, 
reference 3, 
with

$V = -e^2 \sum_j Z_j / |\mathbf{r} - \mathbf{r}_j|.$  (3)


In the above, $\mathbf{r}$ is the coordinate of the incident electron,
$H_0$ is the Hamiltonian of the isolated molecule, and the
subscript j indexes the particles, electrons and nuclei,
composing the molecule. The perturbing Coulomb
interaction V is summed over the coordinates $\mathbf{r}_j$ of all
particles in the molecule. The charge $Z_j$ is −1 for
electrons and is Z=N/2 for either nucleus, with N the
total number of electrons in the molecule. The center
of mass of the entire system, molecule plus incident
electron, is the origin of coordinates. It may be assumed
to coincide with the center of mass of the isolated
molecule, since the incident electron mass m is so small.
The scattering amplitude $A_{ab}(\mathbf{n},\mathbf{n}_0)$ for the transition
from the initial molecular state $\phi_a$ with electron incident
along $\mathbf{n}_0$ to the final molecular state $\phi_b$ with electron
outgoing along $\mathbf{n}$ is¹³


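$A_{ab}(\mathbf{n},\mathbf{n}_0) = -(m/2\pi\hbar^2) \int d\mathbf{r}\, d\mathbf{r}_j\, \exp[-ik_b\mathbf{n}\cdot\mathbf{r}]\, \phi_b^*(\mathbf{r}_j)\, V(\mathbf{r},\mathbf{r}_j)\, \Psi_a(\mathbf{r},\mathbf{r}_j).$  (4)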


Here $\Psi_a$ is a solution of Eq. (1) which satisfies the
usual boundary conditions, i.e., outgoing at infinity
except for its incident part $\phi_a(\mathbf{r}_j)\exp[ik_a\mathbf{n}_0\cdot\mathbf{r}]$. The
integral (and implied spin sum) in Eq. (4) is over the
coordinates of the initial electron and of all particles j
in the molecule. Since V is independent of spin, the
nuclear, molecular electron, and incident electron spins
are individually conserved. The initial and final wave
numbers of the incident electron, $k_a$ and $k_b$, respectively,
are related by

$E = (\hbar^2 k_a^2/2m) + E_a = (\hbar^2 k_b^2/2m) + E_b,$  (5)

with $E_a$ and $E_b$ the energies of the corresponding
molecular states $\phi_a$ and $\phi_b$. The differential cross section
for scattering the electron into the direction $\mathbf{n}$, with
the molecule making the transition from state $\phi_a$ to
$\phi_b$, is

$\sigma_{ab}(\mathbf{n}) = (k_b/k_a)\,|A_{ab}|^2.$  (6)
We confine our attention to molecular states which
can be classified as ¹Σ,¹⁴ in which event the Born-Oppenheimer
approximation to $\phi_a$ or $\phi_b$ is

$\phi(\mathbf{r}_j) = w(\mathbf{r}_e,\mathbf{s})\, S(s)\, Y(\Theta,\Phi),$  (7)

where $\mathbf{r}_e$ refer to the molecular electrons only, $w(\mathbf{r}_e,\mathbf{s})$
are the molecular electronic wave functions for fixed

13 N. F. Mott and H. S. W. Massey, Theory of Atomic Collisions
(Clarendon Press, Oxford, 1949), second edition, Chap. VIII.

14 The ground states of homonuclear diatomic molecules generally
have this classification. In particular, H2 and N2 have ¹Σ
ground states. For O2, with a ³Σ ground state, Eq. (7) requires
some modification, but our evaluation of $\sigma_{ab}(\mathbf{n})$ is not basically
invalidated, since the rotational wave functions are still spherical
harmonics. Our analysis is not applicable to non-Σ states, the
wave functions for which cannot be factored into a product of
electronic and rotational wave functions, and for which the
rotational wave functions are Jacobi polynomials. R. de L.
Kronig, Band Spectra and Molecular Structure (Cambridge
University Press, Cambridge, 1930), pp. 6 ff.
internuclear coordinate $\mathbf{s}$, $S(s)$ are vibrational wave
functions, and $Y(\Theta,\Phi)$ are spherical harmonics describing
the rotational states of the molecule. The coordinates
$(s,\Theta,\Phi)$ of $\mathbf{s}$ are referred to fixed axes in space,
and the nuclei are located at $\pm\tfrac{1}{2}\mathbf{s}$.

In the problem presently at hand, we seek the
transition probability for rotational excitation only,
with the molecule initially and finally in its ground
electronic and vibrational state. In other words, in
Eq. (7) we have $w_a=w_b=w_0$, $S_a=S_b=S_0$, where the
subscript 0 indicates the ground state. The electronic
density distribution in the ground state of the molecule,
for fixed $\mathbf{s}$, is

$\rho(\mathbf{r},\mathbf{s}) = N \int d\mathbf{r}_2 \cdots d\mathbf{r}_N\, |w_0(\mathbf{r},\mathbf{r}_2,\cdots,\mathbf{r}_N,\mathbf{s})|^2.$  (8)


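As usual, spin summation is implied in Eq. (8). It can
now be concluded from Eqs. (3), (7), and (8), that for
rotational excitation only, and in Born approximation,
the matrix element $A_{ab}$ of Eq. (4) is

$A_{ab}(\mathbf{n}) = -(m/2\pi\hbar^2) \int d\mathbf{r}\, e^{i\mathbf{k}\cdot\mathbf{r}} \int d\Theta\, d\Phi\, \sin\Theta\; Y_b^* Y_a\, V'(\mathbf{r},\Theta,\Phi),$  (9)

with $\mathbf{k} = k_a\mathbf{n}_0 - k_b\mathbf{n}$ and

$V'(\mathbf{r},\Theta,\Phi) = \int ds\, s^2\, |S_0(s)|^2\, V''(\mathbf{r},\mathbf{s}),$  (10)

$V''(\mathbf{r},\mathbf{s}) = e^2 \left[ \int d\mathbf{r}'\, \frac{\rho(\mathbf{r}',\mathbf{s})}{|\mathbf{r}-\mathbf{r}'|} - \frac{Z}{|\mathbf{r}-\tfrac{1}{2}\mathbf{s}|} - \frac{Z}{|\mathbf{r}+\tfrac{1}{2}\mathbf{s}|} \right].$  (11)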

Evidently V’ is the electrostatic interaction between 
the incident electron and molecule, for fixed orientation 
of the internuclear axis, averaged over the ground state 
electronic and vibrational wave functions. 

We now make a multipole expansion of V'. The use
of this expansion is justified in Sec. IV on the grounds
used to defend Born approximation, namely that the
principal contribution to the cross section comes from
large values of r. In Σ states $\rho(\mathbf{r}',\mathbf{s})$, and, therefore,
$V''(\mathbf{r},\mathbf{s})$, are axially symmetric¹⁵ about $\mathbf{s}$, so that the
expansion has the form at large r

$V''(\mathbf{r},\mathbf{s}) = -(e^2/r) \sum_n r^{-n} P_n(\mathbf{r},\mathbf{s})\, B_n(s),$  (12)

$V'(\mathbf{r},\Theta,\Phi) = -(e^2/r) \sum_n r^{-n} P_n(\mathbf{r},\mathbf{s})\, C_n,$  (13)

where we find, for even n,

$B_n(s) = 2Z(\tfrac{1}{2}s)^n - \int d\mathbf{r}'\, \rho(\mathbf{r}',\mathbf{s})\, r'^n P_n(\mathbf{r}',\mathbf{s}),$  (14)

$C_n = \int ds\, s^2\, |S_0(s)|^2\, B_n(s).$  (15)


15 R. de L. Kronig, reference 14.
Here $P_n(\mathbf{r},\mathbf{s})$ is the nth Legendre polynomial in the
angle between $\mathbf{r}$ and $\mathbf{s}$, and e is the absolute value of
the electronic charge. $B_n(s)$ is zero for odd n, because
$V''(\mathbf{r},\mathbf{s}) = V''(-\mathbf{r},\mathbf{s})$. This symmetry follows from the
property $\rho(\mathbf{r},\mathbf{s})=\rho(-\mathbf{r},\mathbf{s})$, valid for the gerade or
ungerade $w_0$ of homonuclear molecules.¹⁵ $B_0(s)$ vanishes
because, in Eq. (14), $\int d\mathbf{r}'\rho(\mathbf{r}',\mathbf{s})=N=2Z$. In fact Eq.
(15) makes $C_n$ the nth electric moment of the molecule
along the axis of symmetry $\mathbf{s}$, consistent with the usual
definition for an axially symmetric charge distribution
$q(\mathbf{r})$:

$C_n = \langle q(\mathbf{r})\, r^n P_n(\mathbf{r},\mathbf{s}) \rangle.$  (16)


Of course the monopole moment and all odd electric
moments must vanish for neutral homonuclear molecules.

The leading term in Eq. (12), proportional to the
quadrupole moment, makes the principal contribution
to the energy loss. Retaining this term then, and
deferring until later the justification for neglecting the
higher moments, we find from Eqs. (6), (9), and (13)

$\sigma_{ab}(\mathbf{n}) = (k_b/k_a)(Qa_0/2\pi)^2 \left| \int d\mathbf{r}\, r^{-3} e^{i\mathbf{k}\cdot\mathbf{r}} \int d\Theta\, d\Phi\, \sin\Theta\; Y_b^* Y_a\, P_2(\mathbf{r},\mathbf{s}) \right|^2,$  (17)

where $a_0 = \hbar^2/me^2$, and Q is the measured¹⁶ quadrupole
moment of the molecule, in units of $ea_0^2$. In Eq. (17)
we may interchange the order of integration, whereupon
the integral over $\mathbf{r}$ proves to be trivial. Also we now
label the states a and b by the initial and final rotational
quantum numbers $J_a$, $M_a$, and $J_b$, $M_b$, and perform the
sums over the azimuthal quantum numbers $M_a$ and $M_b$,
thereby determining the effective cross section for a
transition from rotational level $J_a$ to $J_b$. There results:


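$\sigma_{ab}(\mathbf{n}) = \frac{k_b}{k_a}\,\frac{4Q^2a_0^2}{9}\,\frac{1}{2J_a+1} \sum_{M_a,M_b} \int d\Omega\; Y_{J_b}^{M_b}(\mathbf{s})\, Y_{J_a}^{M_a *}(\mathbf{s})\, P_2(\mathbf{k},\mathbf{s}) \int d\Omega'\; Y_{J_b}^{M_b *}(\mathbf{s}')\, Y_{J_a}^{M_a}(\mathbf{s}')\, P_2(\mathbf{k},\mathbf{s}'),$  (18)

where $\mathbf{s}$ is specified by the angles $\Theta$, $\Phi$ in $d\Omega$ and $\mathbf{s}'$ is
similarly specified by $\Theta'$, $\Phi'$ in $d\Omega'$. The sums over $M_a$
and $M_b$ are immediately evaluated. The effective
differential cross section $\sigma_{ab}(\mathbf{n})$ then is seen to be
spherically symmetric, so that the total cross section
for the transition from level a to level b becomes

$\sigma_{ab} = \frac{8\pi Q^2a_0^2}{45}\,\frac{k_b}{k_a}\,(2J_b+1) \int_{-1}^{1} dx\; P_{J_a}(x)\, P_{J_b}(x)\, P_2(x).$  (19)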
a -l 


16 Caution is demanded in using reported values of the quadru- 
pole moment since several different definitions of this quantity 
are current, differing by numerical factors from the definition 
Eq. (16) which we adopt. 







The integral in Eq. (19) is known.$^{17}$ It vanishes except when $J_b=J_a\pm2$ or $J_b=J_a$, in agreement with the usual selection rules for electric quadrupole transitions: $|J_a-J_b|\le2$ and no change in parity. We are interested in the inelastic processes only, $J_b=J_a\pm2$. Our final expressions are

$$\sigma_{J,J+2} = \frac{8\pi Q^{2}a_0^{2}}{15}\,\frac{k_b}{k_a}\,\frac{(J+2)(J+1)}{(2J+3)(2J+1)},$$

$$\sigma_{J,J-2} = \frac{8\pi Q^{2}a_0^{2}}{15}\,\frac{k_b}{k_a}\,\frac{J(J-1)}{(2J-1)(2J+1)}, \tag{20}$$


where $\sigma_{J,J+2}$ refers to a transition from level $J_a=J$ to level $J_b=J+2$, in which the incident electron loses energy to the molecule, and $\sigma_{J,J-2}$ refers to a transition from level $J$ to $J-2$, in which the incident electron gains energy. To very good approximation, the energy levels $E_J$ are

$$E_J = BJ(J+1). \tag{21}$$

Consequently, using Eq. (5), with $\epsilon_a$ the incident electron energy, we have in Eq. (20)

$$\text{for } \sigma_{J,J+2}:\quad \frac{k_b}{k_a}=\left[1-\frac{B}{\epsilon_a}(4J+6)\right]^{1/2}; \qquad \text{for } \sigma_{J,J-2}:\quad \frac{k_b}{k_a}=\left[1+\frac{B}{\epsilon_a}(4J-2)\right]^{1/2}. \tag{22}$$
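As a numerical illustration of Eqs. (20) to (22), the following minimal Python sketch evaluates the two inelastic cross sections in units of $a_0^{2}$. The function names, the argument order, and the choice to return zero for an energetically closed channel are assumptions of this sketch and do not appear in the original derivation.

```python
import math


def momentum_ratio(J, dJ, B, energy):
    """k_b/k_a from Eq. (22); returns 0.0 when the channel is closed."""
    # Energy given to the molecule, E_{J+dJ} - E_J, with E_J = B J (J + 1).
    transfer = B * ((J + dJ) * (J + dJ + 1) - J * (J + 1))
    remaining = 1.0 - transfer / energy
    return math.sqrt(remaining) if remaining > 0.0 else 0.0


def sigma_rot(J, dJ, Q, B, energy):
    """Eq. (20) cross section in units of a0^2 for J -> J + dJ (dJ = +2 or -2).

    Q is the quadrupole moment in units of e*a0^2; B and energy share one unit.
    """
    if dJ == 2:
        angular = (J + 2) * (J + 1) / ((2 * J + 3) * (2 * J + 1))
    elif dJ == -2:
        if J < 2:
            return 0.0  # no level J - 2 exists
        angular = J * (J - 1) / ((2 * J - 1) * (2 * J + 1))
    else:
        raise ValueError("only dJ = +2 or -2 is covered by Eq. (20)")
    return (8.0 * math.pi / 15.0) * Q**2 * momentum_ratio(J, dJ, B, energy) * angular
```

With the nitrogen values quoted in Sec. III ($B=0.249\times10^{-3}$ ev, $|Q|=0.96$), `sigma_rot(0, 2, 0.96, 0.249e-3, 0.1)` gives about $1.02\,a_0^{2}$ for 0.1-ev electrons.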


We conclude this section with the remark that the cross sections, Eq. (20), are not altered by the nuclear spin selection rule which results from the connection between spin and statistics. When the nuclei are identical and the molecule in a $^{1}\Sigma$ state, rotational levels with the same total nuclear spin must have the same parity, $(-)^{J}$. Since the nuclear spin cannot be changed by the potential $V$ of Eq. (3), this implies the selection rule $\Delta J$ even, which coincides with the selection rule already derived.

FIG. 1. Comparison of fractional energy losses computed from Eq. (26) or Eqs. (22) and (23).

17 E. U. Condon and G. H. Shortley, The Theory of Atomic Spectra (Cambridge University Press, London, 1951), p. 182.
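The selection rule can be checked directly on the Legendre integral of Eq. (19). The sketch below is only an illustration, assuming nothing beyond the standard Bonnet recurrence for $P_n$; the helper names are hypothetical. It evaluates $\int_{-1}^{1}P_{J_a}P_{J_b}P_n\,dx$ exactly with rational arithmetic.

```python
from fractions import Fraction


def legendre_coeffs(n):
    """Coefficients of P_n, lowest power first, from Bonnet's recurrence."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p_prev
    for m in range(1, n):
        # (m + 1) P_{m+1}(x) = (2m + 1) x P_m(x) - m P_{m-1}(x)
        shifted = [Fraction(0)] + p
        padded = p_prev + [Fraction(0)] * (len(shifted) - len(p_prev))
        p_prev, p = p, [((2 * m + 1) * a - m * b) / (m + 1)
                        for a, b in zip(shifted, padded)]
    return p


def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def triple_integral(Ja, Jb, n=2):
    """Exact integral of P_Ja * P_Jb * P_n over [-1, 1]."""
    prod = poly_mul(poly_mul(legendre_coeffs(Ja), legendre_coeffs(Jb)),
                    legendre_coeffs(n))
    # The integral of x^k over [-1, 1] is 2/(k + 1) for even k, 0 for odd k.
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(prod) if k % 2 == 0)
```

For example, `triple_integral(1, 2)` and `triple_integral(0, 4)` vanish, while $(2J_b+1)/3$ times `triple_integral(J, J + 2)` reproduces the angular factor $(J+2)(J+1)/[(2J+3)(2J+1)]$ of Eq. (20).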


III. NUMERICAL RESULTS 


The rate at which an electron of energy $\epsilon_a$ loses energy in rotational excitation of the homonuclear gas is

$$\frac{dW_r}{dt} = v_a\sum_J N_J\left[\sigma_{J,J+2}\,(E_{J+2}-E_J) - \sigma_{J,J-2}\,(E_J-E_{J-2})\right], \tag{23}$$

where $N_J$ is the number of molecules per cc in the $J$th rotational level, $\sigma_{J,J+2}$ and $\sigma_{J,J-2}$ are given by Eq. (20), and $v_a$ is the electron velocity. Except at the very highest gas temperatures and lowest electron energies, the velocities of the molecules may be and have been neglected in Eq. (23) compared to $v_a$. Particularly at the higher electron energies, $k_b/k_a$ may be replaced by unity. This approximation yields the very simple result, independent of the relative populations $N_J$,

$$dW_r/dt = (32\pi Q^{2}a_0^{2}/15)\,N B v_a, \tag{24}$$

where $N$ is the total number of molecules per cc. According to Eqs. (20) and (22), Eq. (24) always overestimates the energy loss computed from Eq. (23).

In units of $2m/M$, the mean fractional energy loss $\lambda'$ of the electrons, per collision with the gas molecules, is defined in terms of the total rate of energy loss $dW/dt$ by the relation

$$\lambda' = (M/2m)\,(N\sigma_t v_a\epsilon_a)^{-1}\,(dW/dt), \tag{25}$$

where $\sigma_t$ is the total collision cross section, elastic plus inelastic. $\lambda'$ equals unity when the elastic cross section is spherically symmetric, the gas velocity negligible compared to the electron velocity, and inelastic losses are unimportant. If the inelastic losses are well approximated by Eq. (24), then

$$\lambda' = 1 + \frac{32\pi Q^{2}a_0^{2}}{15\,\sigma_t}\,\frac{B}{\epsilon_a}\,\frac{M}{2m}. \tag{26}$$
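The statement that Eq. (24) is independent of the relative populations can be verified term by term: with $k_b/k_a=1$, each level contributes the same amount to the sum in Eq. (23). The sketch below is illustrative only; the function name and the use of fractional populations are assumptions made here.

```python
import math


def loss_sum(populations, B=1.0, Q=1.0):
    """Sum in Eq. (23) with k_b/k_a = 1, in units of a0^2 * B.

    populations[J] is the fraction of molecules in rotational level J.
    """
    total = 0.0
    for J, n_J in enumerate(populations):
        # Angular factor of Eq. (20) times the level spacing in units of B.
        up = (J + 2) * (J + 1) / ((2 * J + 3) * (2 * J + 1)) * (4 * J + 6)
        down = J * (J - 1) / ((2 * J - 1) * (2 * J + 1)) * (4 * J - 2)
        total += n_J * (8.0 * math.pi / 15.0) * Q**2 * B * (up - down)
    return total
```

Any normalized set of populations, for instance `loss_sum([0.2, 0.5, 0.3])`, returns $32\pi/15$, the coefficient appearing in Eq. (24).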


Losses in Nitrogen 


In nitrogen,$^{18,19}$ $B=0.249\times10^{-3}$ ev, $|Q|=0.96$. In Fig. 1 are compared $\lambda'$ from Eq. (26) with $\lambda'$ computed from Eqs. (22) and (23), for 0.025 to 0.6 ev electrons incident on N$_2$ at 290°K, using$^{20}$ $\sigma_t=4.8\pi a_0^{2}$ independent of electron energy and employing a Boltzmann distribution for the populations $N_J$. The utility of the closed expression Eq. (26) for $\lambda'$ may be gauged from this figure. In Fig. 2 the exact values of $\lambda'$, from Eqs. (22) and (23), again for N$_2$ at 290°K, are compared with the values of $\lambda'$ obtained by Crompton and Sutton® (curve C) when they assumed their electrons had a Maxwellian distribution. Since the magnitude and energy dependence of $\sigma_t$ are uncertain, two “exact” curves are drawn in Fig. 2, curve A using the aforementioned constant value of $\sigma_t=4.8\pi a_0^{2}$, and curve B using an energy dependent $\sigma_t$ (B′ of Fig. 3). The various reported values$^{20-22}$ of the cross section$^{23}$ for electrons in N$_2$, in the energy range below 0.8 ev, are summarized in Fig. 3. Curve A′ in Fig. 3 is the constant value $\sigma_t=4.8\pi a_0^{2}$, which is seen to lie below the other reported values, but which lies very close to the theoretical estimate by Fisk$^{24}$ of the elastic cross section. Curve B′ is drawn through the experimental points of Crompton and Sutton,* and extrapolated from their observations. Comparison of curves A and B in Fig. 2 illustrates the influence on the theoretical estimate of $\lambda'$ of differing assumptions concerning the magnitude and energy dependence of $\sigma_t$. These curves also show that the predicted fractional energy loss by rotational excitation is an order of magnitude greater than the value $\lambda'=1$ expected for elastic losses only.

18 G. Herzberg, Molecular Spectra and Molecular Structure I (D. Van Nostrand, New York, 1950), Table 39.

19 W. V. Smith and R. Howard, Phys. Rev. 79, 132 (1950).

20 Phelps, Fundingsland, and Brown, Phys. Rev. 84, 559 (1951).

FIG. 2. Theoretical energy losses obtained by assuming various elastic cross sections.
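To give a feeling for the magnitudes involved, the closed form obtained by inserting Eq. (24) into Eq. (25), with the elastic contribution taken as unity, can be evaluated for nitrogen. This is a rough sketch under stated assumptions: the mass constants below are standard values supplied here, not taken from the text, and the function name is hypothetical.

```python
import math

ELECTRON_MASS_U = 5.48579909e-4  # electron mass in atomic mass units (assumed constant)
N2_MASS_U = 28.0134              # molecular mass of N2 in atomic mass units (assumed)


def lambda_prime(energy_ev, Q=0.96, B_ev=0.249e-3, sigma_t_a0sq=4.8 * math.pi):
    """Fractional energy loss in units of 2m/M: elastic part plus Eq. (24) rotational part.

    sigma_t_a0sq is the total cross section in units of a0^2.
    """
    mass_factor = N2_MASS_U / (2.0 * ELECTRON_MASS_U)  # M/2m
    rotational = ((32.0 * math.pi * Q**2 / 15.0) * (B_ev / energy_ev)
                  * mass_factor / sigma_t_a0sq)
    return 1.0 + rotational
```

With these inputs `lambda_prime(0.1)` is roughly 27, an order of magnitude above the elastic value of unity, and the rotational term falls off as $1/\epsilon_a$.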

The analysis of the raw data of the swarm experiments to get the fractional energy loss as a function of mean energy is complicated and involves numerous assumptions, e.g., the electrons have a specified distribution (Maxwellian or Druyvesteyn) and a mean free path independent of velocity. For this reason the experimental curve C (Fig. 2) represents what might be called an effective energy loss vs mean energy in the swarm, and is not simply related to the average of curves A or B over the distribution function. Nonetheless we may conclude from curves A or B and C, as they stand, that rotational excitation, occurring through the coupling between the electron and the molecular quadrupole field, is of the right order of magnitude$^{25}$ to account for the observed energy losses in N$_2$, at energies below the vibrational threshold (0.29 ev, indicated by the arrow on the energy scale in Fig. 2). If vibrational excitation is in fact negligible, the electron distribution function can be computed by numerical integration of the Boltzmann equation, using some assumed elastic cross section and the theoretical inelastic cross sections of Eq. (20). From the distribution function there can be determined, again using Eq. (20), the expected drift velocities and diffusion coefficients which are the raw data of the swarm experiments. Thus it is possible in principle to make a detailed comparison of our theory with the experimental results. In view of the previously mentioned experimental and theoretical uncertainties in the swarm experiments, and our present inaccurate knowledge of the elastic cross section $\sigma_t$ and quadrupole moment$^{26}$ $Q$, such a comparison probably would be no more than qualitatively significant. However, with any reasonable assumptions concerning the errors in the swarm experiments, and the values of $\sigma_t$ and $Q$, it is unlikely that such a comparison would not bear out an obvious inference from Fig. 2, namely that rotational excitation becomes relatively unimportant at energies above the vibrational threshold, and consequently that vibrational excitation becomes important. As explained in Sec. IV the approximations leading to Eq. (20) are increasingly inaccurate in N$_2$ as the incident energy increases above the vibrational threshold, so that the foregoing inference may not stand up. Nonetheless it appears worthwhile to re-examine the theoretical predictions of only small energy loss by vibrational excitation,’:°" particularly since the similar theoretical estimates of the rotational loss appear to have been too small.

21 C. Ramsauer and R. Kollath, Z. Physik 4, 91 (1930).

22 J. Townsend and V. A. Bailey, Phil. Mag. 42, 873 (1921).

23 It is not always clear whether the experiments measured $\sigma_t$, the total cross section, or merely $\sigma_e$, the elastic cross section. The distinction is not important, however, since the inelastic cross sections $\sigma_{J,J+2}$ and $\sigma_{J,J-2}$ of Eq. (20) turn out to be much smaller than the reported values of $\sigma_t$.

24 J. B. Fisk, Phys. Rev. 49, 167 (1936).





FIG. 3. Elastic cross section data. (Legend: points of Ramsauer and Kollath; points of Townsend and Bailey; curve B′; curve of Fisk; curve A′.)


25 We remark that curves A or B are indistinguishable from computed curves taking into account corrections to the rotational spacing, i.e., using $E_J=BJ(J+1)-DJ^{2}(J+1)^{2}$ with $D=0.72\times10^{-9}$ ev. See Herzberg, reference 18.

26 Gordy, Smith, and Trambarulo, Microwave Spectroscopy (John Wiley and Sons, Inc., New York, 1953), p. 294.

Laboratory cross modulation experiments provide further data. When the average electron energy $\bar\epsilon_a$ is very nearly equal to the mean kinetic energy $\bar\epsilon_0$ of the molecules, the rate of loss of energy per electron usually is taken to be

$$(dW/dt) = G\nu\,(\bar\epsilon_a-\bar\epsilon_0), \tag{27}$$

where $\nu$ is the collision frequency and $G$ a numerical factor. For a Maxwellian distribution of electrons, making elastic collisions only,$^{27}$ $G=8m/3M$. Goldstein* finds $G\nu=6.4\times10^{5}$ sec$^{-1}$. In the cross-modulation experiment the electron distribution presumably is Maxwellian, at a temperature nearly equal to the gas temperature. Averaging Eq. (23) over a Maxwellian distribution, and dividing by $(\bar\epsilon_a-\bar\epsilon_0)$, we can compute the theoretical value of $G\nu$ implied by our rotational excitation cross sections. We find $G\nu=13\times10^{5}$ sec$^{-1}$, independent of $\sigma_t$ [which does not appear in Eq. (23)], for an electron temperature of 290°K, equal to the gas temperature. Using $\sigma_t=4.8\pi a_0^{2}$, Goldstein's value of $G\nu$ corresponds to $G=51(2m/M)$, and our value to $G=100(2m/M)$. These results again demonstrate that rotational excitation can account for observed losses much larger than elastic.


IV. VALIDITY OF APPROXIMATIONS 


The following approximations employed in Sec. II 
require discussion: (i) Born approximation; (ii) the 
multipole expansion of V’, Eq. (13); (iii) the neglect 
of moments higher than quadrupole in obtaining Eq. 
(20). Granting (i) and (ii), justification of (iii) is not 
difficult. Retaining the higher moments in Eq. (13) 
we find, by the same procedure as was used to obtain 
Eq. (19), that the differential cross section for transitions from rotational level $J_a$ to $J_b$ is


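$$\sigma_{ab}(\mathbf{n}) = \frac{k_b}{k_a}\,2a_0^{2}\,(2J_b+1)\sum_n \frac{(ka_0)^{2n-4}\,Q_n^{2}}{(2n+1)\,[(2n-1)!!]^{2}}\int_{-1}^{1}dx\,P_{J_a}(x)\,P_{J_b}(x)\,P_n(x), \tag{28}$$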


where the sum is over even $n$ only, of course, and we have introduced

$$C_n = Q_n\,e\,a_0^{n}; \tag{29}$$

i.e., $Q_n$ is the $n$th electric moment of the molecule, in units of $ea_0^{n}$. In these units, for N$_2$ or any likely homonuclear diatomic molecule, $Q_n$ is a number at most of order unity, and probably decreases rapidly with increasing $n$. The maximum magnitude of $k$ is $k_a+k_b$, attained when $\mathbf{n}$ is antiparallel to $\mathbf{n}_0$. As $k_b$ is always nearly equal to $k_a$, we estimate that for 0.6-ev electrons incident, at which energy in N$_2$ our computed rotational losses become about equal to elastic (Fig. 2), the maximum value of $ka_0$ in Eq. (28) is 0.42. Since the series in Eq. (28) is an expansion in powers of $(ka_0)^{2}$, it is apparent that at the low energies in which we are interested, the moments higher than quadrupole will make a small or negligible contribution to the cross sections Eq. (20).

27 A. M. Cravath, Phys. Rev. 36, 248 (1930).

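In order to examine the validity of (i) and (ii), we write the matrix element, Eq. (4), as the sum of two amplitudes:

$$A_{ab}(\mathbf{n},\mathbf{n}_0) = A_1 + A_2, \tag{30}$$

where the “near-field” amplitude is

$$A_1 = -(m/2\pi\hbar^{2})\int_{r<r_0}d\mathbf{r}\int d\mathbf{r}_j\,\exp[-ik_b\mathbf{n}\cdot\mathbf{r}]\,\varphi_b^{*}(\mathbf{r}_j)\,V(\mathbf{r},\mathbf{r}_j)\,\Psi_a(\mathbf{r},\mathbf{r}_j), \tag{31}$$

and the “far-field” amplitude is

$$A_2 = -(m/2\pi\hbar^{2})\int_{r>r_0}d\mathbf{r}\int d\mathbf{r}_j\,\exp[-ik_b\mathbf{n}\cdot\mathbf{r}]\,\varphi_b^{*}(\mathbf{r}_j)\,V(\mathbf{r},\mathbf{r}_j)\,\Psi_a(\mathbf{r},\mathbf{r}_j). \tag{32}$$

The distance $r_0$ is so chosen that only a negligible fraction of the molecular charge distribution lies outside $r_0$. In other words, the integral in $A_1$ extends over the interior of the molecule, and that in $A_2$ over the exterior of the molecule. $A_2$, but not $A_1$, can be correctly evaluated using the multipole expansion.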

We assume for the moment that $A_1$ is negligible in Eq. (30), and that Born approximation is valid in Eq. (32). We are thereby led to the cross section


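$$\sigma_{ab}(\mathbf{n}) = \frac{k_b}{k_a}\,2a_0^{2}\,(2J_b+1)\sum_n (ka_0)^{2n-4}\,Q_n^{2}\left[\frac{j_{n-1}(kr_0)}{(kr_0)^{n-1}}\right]^{2}\frac{1}{2n+1}\int_{-1}^{1}dx\,P_{J_a}(x)\,P_{J_b}(x)\,P_n(x). \tag{33}$$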


In deriving Eq. (28), (i) and (ii) were assumed valid for all $r$, i.e., $r_0$ was assumed equal to zero in Eqs. (31) and (32), which made $A_1$ identically zero. Thus as $r_0$ approaches zero in Eq. (33), that equation becomes identical with Eq. (28). At $\rho=1$ the functions $j_{n-1}(\rho)/\rho^{n-1}$ appearing in Eq. (33) are only ten percent different from their values$^{28}$ at $\rho=0$. If $kr_0$ does not exceed unity, therefore, $\sigma_{ab}(\mathbf{n})$ of Eq. (33) will differ from $\sigma_{ab}(\mathbf{n})$ of Eq. (28) by at most twenty percent. Also the quadrupole contribution in Eq. (33) will be as dominant as in Eq. (28). We thereby have reduced the problem of justifying Eq. (20) to demonstrating (a) $kr_0<1$; (b) (i) is valid in Eq. (32); and (c) $A_1$ of Eq. (31) is negligible.

In N$_2$, the internuclear distance in the ground vibrational state$^{18}$ is $2.1a_0$. The atomic radius of nitrogen is$^{29}$ close to $1.0a_0$. Consequently, it is reasonable to assume that in N$_2$ the charge distribution is mainly confined to the interior of a sphere of radius $r_0\cong2.1a_0$, centered at the origin midway between the two nuclei. Thus in N$_2$, for 0.6-ev electrons incident, the maximum value of $kr_0$ is 0.88. Moreover since $k_b$ is nearly equal to $k_a$, and the differential cross sections leading to Eq. (20) are spherically symmetric, a mean value of $k=|k_b\mathbf{n}-k_a\mathbf{n}_0|$ is $\sqrt2\,k_a$, which makes $kr_0=0.62$ in N$_2$, for 0.6-ev electrons incident. We infer that in N$_2$, for incident electron energies $<0.6$ ev, condition (a), $kr_0<1$, is satisfied, and in fact so well that the differences between cross sections computed from Eqs. (20) and (33) are not significant, especially at the lower incident electron energies.

28 Tables of Spherical Bessel Functions (Columbia University Press, New York, 1947).

29 D. R. Hartree and W. Hartree, Proc. Roy. Soc. (London) A193, 299 (1948).
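The numerical estimates quoted for 0.6-ev electrons follow from $k_aa_0=(\epsilon_a/\mathrm{Ry})^{1/2}$. The short sketch below reproduces them; the Rydberg energy of 13.6057 ev is a standard constant supplied here rather than taken from the text, and the variable names are illustrative.

```python
import math

RYDBERG_EV = 13.6057  # hbar^2 / (2 m a0^2) in ev (assumed standard value)


def k_a0(energy_ev):
    """Electron wave number times the Bohr radius."""
    return math.sqrt(energy_ev / RYDBERG_EV)


R0_OVER_A0 = 2.1                                  # sphere radius r0 in units of a0
ka = k_a0(0.6)                                    # k_a * a0 at 0.6 ev
max_k_a0 = 2.0 * ka                               # k_a + k_b with k_b nearly equal to k_a
max_k_r0 = max_k_a0 * R0_OVER_A0                  # largest k * r0
mean_k_r0 = math.sqrt(2.0) * ka * R0_OVER_A0      # mean k = sqrt(2) k_a
```

These evaluate to about 0.42, 0.88, and 0.62 respectively, in agreement with the values quoted for Eq. (28) and for condition (a).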

It is more difficult to demonstrate condition (b). An indication of the order of magnitude of the distance $r_1$ beyond which the distortion of the incident wave function is small is provided by the criterion:

$$Qa_0^{2}e^{2}/r_1^{3} = \hbar^{2}k_a^{2}/2m, \tag{34}$$

which makes $r_1$ the distance at which the incident electron energy equals its interaction energy with the molecule. This estimate of $r_1$ is based on the conventional view that Born approximation is valid when the interaction energy is small compared to the kinetic energy, as is the case at distances $r>r_1$. From Eq. (34) we obtain, with $k=\sqrt2\,k_a$,

$$kr_1 = \sqrt2\,(2Qk_aa_0)^{1/3}. \tag{35}$$

Equation (35) makes $kr_1$ just about equal to unity for 0.6-ev electrons in N$_2$. As the incident energy decreases, $kr_1$ approaches zero, though $r_1$ increases and becomes infinite at zero incident energy. We have already seen, from comparison of Eqs. (20), (28), and (33), that the principal contribution to the scattering amplitude of Eq. (9), or to $A_2$ of Eq. (32), comes from distances $r$ such that $kr>1$. Equation (35) suggests therefore that for N$_2$, with incident electron energies $<0.6$ ev, Born approximation is valid, increasingly so as the incident energy is decreased, since at these low energies the principal contribution to Eq. (32) appears to arise from distances $r$ at which the wave function is only slightly distorted from its incident form.
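The trend described here can be made concrete with a small calculation. The sketch below assumes the criterion of Eq. (34) in the form $Qa_0^{2}e^{2}/r_1^{3}=\hbar^{2}k_a^{2}/2m$ and a mean $k=\sqrt2\,k_a$; the function names and the Rydberg constant are supplied for illustration.

```python
import math

RYDBERG_EV = 13.6057  # hbar^2 / (2 m a0^2) in ev (assumed standard value)


def r1_over_a0(energy_ev, Q=0.96):
    """Distance r1 of Eq. (34) in units of a0: r1^3 = 2 Q a0^3 / (k_a a0)^2."""
    ka_a0_sq = energy_ev / RYDBERG_EV
    return (2.0 * Q / ka_a0_sq) ** (1.0 / 3.0)


def k_r1(energy_ev, Q=0.96):
    """Product k * r1 of Eq. (35), with k = sqrt(2) k_a."""
    ka_a0 = math.sqrt(energy_ev / RYDBERG_EV)
    return math.sqrt(2.0) * (2.0 * Q * ka_a0) ** (1.0 / 3.0)
```

At 0.6 ev this gives $kr_1$ close to 1.04 with $r_1$ near $3.5a_0$; lowering the energy makes $kr_1$ smaller while $r_1$ grows, as stated above.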

The criterion Eq. (34) is not theoretically sound because at any distance $r$ the distortion of the wave function from its incident form depends not merely on the value of the interaction at that $r$, but also on the values of the interaction at all other distances, in particular at smaller distances, where the interaction is larger. Moreover the distortion of the wave function at small distances may be so great as to result in an appreciable contribution to the scattering amplitude from distances $kr<1$, even though these distances are not significant in first Born approximation. We have further examined this question by evaluating the second Born approximation to the scattering amplitude for a pure quadrupole interaction, i.e., in Eq. (13) $V'=-eC_2r^{-3}P_2(\mathbf{r},\mathbf{s})$ for all $r$. The difference between the first and second Born approximations is a first estimate of the error made in Eq. (4) by neglecting the distortion from $\varphi_a(\mathbf{r}_j)\exp(ik_a\mathbf{n}_0\cdot\mathbf{r})$ of $\Psi_a$. Details of the calculation (which involves some approximations) are given elsewhere,$^{30}$ and it is found that

$$A_{ab}^{(2)} \cong (1+0.15\,Qk_aa_0)\,A_{ab}^{(1)}, \tag{36}$$

where $A_{ab}^{(1)}$ and $A_{ab}^{(2)}$ are respectively the first and second Born approximations to $A_{ab}$ of Eq. (4), using a pure quadrupole interaction. From Eq. (36), $A_{ab}^{(2)}/A_{ab}^{(1)}$ approaches unity as $k_a\to0$, and in N$_2$, for 0.6-ev electrons, equals 1.03. Thus we conclude that our qualitative argument based on Eqs. (34) and (35) did not lead us astray; that in computing the “far-field” amplitude $A_2$ of Eq. (32) (first) Born approximation is increasingly valid as the incident energy decreases; and that in N$_2$, for incident electron energies $<0.6$ ev, the error in first Born approximation to $A_2$ probably is not appreciable, although since the factor in Eq. (36) is admittedly approximate, an error of 10 percent or more in $|A_2|^{2}$ cannot be ruled out at energies close to 0.6 ev.
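The quoted ratio of 1.03 follows from the correction factor $1+0.15\,Qk_aa_0$ of Eq. (36). A minimal sketch, with a hypothetical function name and the Rydberg energy supplied as a standard constant:

```python
import math

RYDBERG_EV = 13.6057  # hbar^2 / (2 m a0^2) in ev (assumed standard value)


def second_to_first_born(energy_ev, Q=0.96):
    """Ratio of second to first Born amplitude from the factor in Eq. (36)."""
    ka_a0 = math.sqrt(energy_ev / RYDBERG_EV)
    return 1.0 + 0.15 * Q * ka_a0


ratio = second_to_first_born(0.6)   # amplitude ratio at 0.6 ev
squared = ratio ** 2                # corresponding change in the squared amplitude
```

The amplitude ratio at 0.6 ev rounds to 1.03, so the squared amplitude changes by about 6 percent if the factor is taken at face value.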

To complete the justification of Eq. (20) we must demonstrate (c) $A_1$ of Eq. (31) is negligible. This we do by evaluating $A_1$ in Born approximation, although we recognize that in the region $r<r_0$ Born approximation is not valid. In default of a better way to estimate $A_1$, however, we trust that Born approximation does give a measure of the magnitude of $A_1$. It turns out, as we shall show, that in Born approximation, for incident electron energies $<0.6$ ev in N$_2$, $A_1$ is small, though perhaps not negligibly so at 0.6 ev. We conclude therefore, that our theoretical estimates of the energy losses in N$_2$, especially at energies below the vibrational threshold, probably would not be significantly altered by including in $A_{ab}(\mathbf{n},\mathbf{n}_0)$ of Eq. (32), along with $A_2$, the correct “near-field” amplitude $A_1$ somehow arduously computed from Eq. (31).

Our conclusion that $A_1$ is small is supported and made understandable by the argument of Sec. I, indicating that for a short-range interaction, because the electron cannot both go in and come out as an s-wave, the cross section for rotational excitation by slow electrons is small. To elaborate somewhat, at distances $r<r_0$ the multipole expansion is not legitimate. However, the effective interaction always can be expanded in spherical harmonics. In Born approximation, for all $r$,

$$V'(r,\Theta,\Phi) = -e\sum_n f_n(r)\,P_n(\mathbf{r},\mathbf{s}), \tag{37}$$

where $V'$ is defined by Eqs. (10) and (11), the sum is over even $n$ only, and $f_n(r)=C_nr^{-n-1}$ for large $r$. As we have seen, at the low energies of interest, even when the multipole expansion is assumed valid all the way to $r=0$, which makes $f_n(r)$ highly divergent at the origin, the contribution to the inelastic amplitude $A_{ab}$ from distances $r<r_0$ is small. Thus we expect $A_1$ to be small for small $kr_0$; still $A_1$ may not be negligible as $kr_0$ nears unity if there is a region $r<r_0$ in which the terms $f_n(r)$ are appreciably larger than their corresponding asymptotic forms $C_nr^{-n-1}$. Of course the contribution to $A_1$ from the spherically symmetric term $n=0$ in Eq. (37) has not been assessed by extending the multipole expansion to $r=0$, since the molecule has no monopole moment. But a spherically symmetric interaction cannot of itself cause a rotational transition. Consequently the short-range term $f_0(r)$ has to cause rotational transitions not only through incident and outgoing waves which, with decreasing incident energy, have a vanishingly small probability of being found at $r<r_0$, but also only in higher approximation, through waves which have already been scattered by the long-range non-spherically symmetric part of the interaction.

30 S. Stein, thesis, University of Pittsburgh, 1955 (unpublished).

FIG. 4. Ratio of $P_2$ terms used in computing the near- and far-field amplitudes. $\gamma=ef_2/(Qa_0^{2}e^{2}/r^{3})$.
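An estimate of the magnitude of $A_1$ is obtained on the assumption that the charge distribution is composed of two spherically symmetric parts, each centered about a nucleus. Such a charge distribution has vanishing multipole moments of all orders, so that $A_2$ is zero in Eq. (32), and $A_1$ becomes precisely the total inelastic amplitude $A_{ab}$ computed, in Born approximation, by Morse.’ We have, extending now $r_0$ to $\infty$ in Eq. (31),

$$A_1 = -(m/2\pi\hbar^{2})\int d\mathbf{r}\,d\mathbf{s}\;e^{i\mathbf{k}\cdot\mathbf{r}}\,|S_v(s)|^{2}\,Y_{J_a}^{M_a}(\mathbf{s})\,Y_{J_b}^{M_b*}(\mathbf{s})\,V''(\mathbf{r},\mathbf{s}), \tag{38}$$

where $V''$ is identical with $V''$ of Eq. (11) but is now specifically assumed to have the form

$$V''(\mathbf{r},\mathbf{s}) = U(|\mathbf{r}-\tfrac12\mathbf{s}|) + U(|\mathbf{r}+\tfrac12\mathbf{s}|). \tag{39}$$

Then, following Morse,

$$A_1 = 2f_a(\theta)\int d\mathbf{s}\,|S_v(s)|^{2}\,Y_{J_a}^{M_a}(\mathbf{s})\,Y_{J_b}^{M_b*}(\mathbf{s})\,\cos(\tfrac12\mathbf{k}\cdot\mathbf{s}), \tag{40}$$

with

$$f_a(\theta) = -(m/2\pi\hbar^{2})\int d\mathbf{r}\,e^{i\mathbf{k}\cdot\mathbf{r}}\,U(r) = -(2m/\hbar^{2})\int dr\,r^{2}\,j_0(kr)\,U(r). \tag{41}$$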


In N$_2$ at low energies, recalling that $r_0$ is the internuclear distance in the ground vibrational state, the leading term in Eq. (40) is, since $J_b=J_a\pm2$,

$$A_1 = -10\,f_a(\theta)\,j_2(\tfrac12kr_0)\int d\Omega\,Y_{J_a}^{M_a}(\mathbf{s})\,Y_{J_b}^{M_b*}(\mathbf{s})\,P_2(\mathbf{k},\mathbf{s}). \tag{42}$$

The inelastic amplitude $A_2$ for this transition, from $J_a$, $M_a$ to $J_b$, $M_b$ is, as can be verified from Eq. (18),

$$A_2 = -\tfrac23\,Qa_0\int d\Omega\,Y_{J_a}^{M_a}(\mathbf{s})\,Y_{J_b}^{M_b*}(\mathbf{s})\,P_2(\mathbf{k},\mathbf{s}). \tag{43}$$

Thus, for small $k$,

$$A_1/A_2 = f_a(\theta)\,(\tfrac12kr_0)^{2}/Qa_0. \tag{44}$$

In nitrogen, if we use the parameters of Duncanson and Coulson,$^{31}$ the effective charge density $\rho(r)$ contributing in the low-energy limit to $f_a(\theta)$ of Eq. (41) arises almost entirely from the $2s$ and $2p$ electrons, and is very closely represented, in atomic units, by


$$\rho(r) = \frac{5\mu^{5}}{3\pi}\,r^{2}e^{-2\mu r}, \tag{45}$$


with $\mu=1.95$. We find $f_a(\theta)=3.5a_0$, so that from Eq. (44), with $k=\sqrt2\,k_a$,

$$A_1 = 8\,(k_aa_0)^{2}A_2. \tag{46}$$

At 0.6 ev, Eq. (46) yields $A_1/A_2=0.35$ which, though small, is not negligible. Because it is proportional to $k_a^{2}$, $A_1/A_2$ does become negligible at lower energies, below the vibrational threshold.
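The coefficient 8 in Eq. (46) and the value 0.35 at 0.6 ev can be recovered from Eq. (44) with $k=\sqrt2\,k_a$, $r_0=2.1a_0$, $f_a(\theta)=3.5a_0$, and $Q=0.96$. The following sketch does that arithmetic; the function name is hypothetical and the Rydberg energy is a standard constant supplied here.

```python
RYDBERG_EV = 13.6057  # hbar^2 / (2 m a0^2) in ev (assumed standard value)


def amplitude_ratio(energy_ev, f_a=3.5, r0=2.1, Q=0.96):
    """A1/A2 from Eq. (44); f_a and r0 are in units of a0, Q in units of e*a0^2."""
    ka_a0_sq = energy_ev / RYDBERG_EV   # (k_a a0)^2
    k_a0_sq = 2.0 * ka_a0_sq            # mean k = sqrt(2) k_a
    return f_a * 0.25 * k_a0_sq * r0**2 / Q


coefficient = amplitude_ratio(0.6) / (0.6 / RYDBERG_EV)   # multiplies (k_a a0)^2
```

The coefficient comes out near 8.04, and `amplitude_ratio(0.6)` rounds to 0.35.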

The approximations leading to Eq. (42) are such that $A_1$ therein depends only on the coefficient of $P_2(\mathbf{r},\mathbf{s})$ in the expansion of $V''(\mathbf{r},\mathbf{s})$, Eq. (39). The coefficient $ef_2(r)$ of Eq. (37) can be computed for the potential $U(r)$ resulting from the charge distribution of Eq. (45), by fixing the nuclei at $\tfrac12s=\pm1.05a_0$ and making use of the expansion$^{32}$ in Legendre polynomials of $\exp(-2\mu|\mathbf{r}-\tfrac12\mathbf{s}|)/|\mathbf{r}-\tfrac12\mathbf{s}|$. In Fig. 4 we plot the ratio $\gamma$ for N$_2$ of the thereby determined $ef_2(r)$ to $Qe^{2}a_0^{2}r^{-3}$. It is seen that $ef_2(r)$ greatly exceeds the pure quadrupole interaction in an extended region about $r=r_0/2$. This makes reasonable the fact that $A_1/A_2$ turns out to be non-negligible at energies near 0.6 ev, where $kr_0=0.62$.

31 W. E. Duncanson and C. A. Coulson, Proc. Roy. Soc. Edinburgh A62, 37 (1944).

32 G. N. Watson, Treatise on the Theory of Bessel Functions (Cambridge University Press, London, 1952), pp. 80 and 366.

It also is possible to estimate $A_1$ as does Morse’ in a semiempirical fashion from the known elastic scattering cross section $\sigma_e$, noting that in the same approximation as Eq. (40) the elastic scattering amplitude is $2f_a(\theta)\,j_0(\tfrac12kr_0)$. With $\sigma_t=4.8\pi a_0^{2}$ we obtain $f_a(\theta)=0.55a_0$, which, substituted in Eq. (44), implies $A_1/A_2$ is about 0.05 at 0.6 ev in N$_2$. Hence this method of estimating $A_1$ indicates it is in fact negligibly small even at 0.6 ev; in any event it supports the view that Eq. (46) is not a gross underestimate of $A_1/A_2$.
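The semiempirical numbers can be reproduced if one assumes, as a simplification made here, that the elastic amplitude $2f_a(\theta)j_0(\tfrac12kr_0)$ is isotropic with $j_0\to1$, so that $\sigma=4\pi(2f_a)^{2}=16\pi f_a^{2}$. The sketch below is illustrative only; the names and the Rydberg constant are not from the text.

```python
import math

RYDBERG_EV = 13.6057  # hbar^2 / (2 m a0^2) in ev (assumed standard value)


def f_a_from_sigma(sigma_a0sq):
    """f_a in units of a0 from sigma = 16 pi f_a^2 (low-energy limit, assumed)."""
    return math.sqrt(sigma_a0sq / (16.0 * math.pi))


f_a = f_a_from_sigma(4.8 * math.pi)                # elastic-scattering estimate of f_a
k_a0 = math.sqrt(2.0 * 0.6 / RYDBERG_EV)           # mean k = sqrt(2) k_a at 0.6 ev
ratio = f_a * (0.5 * k_a0 * 2.1) ** 2 / 0.96       # Eq. (44) with r0 = 2.1 a0, Q = 0.96
```

This gives $f_a\approx0.55a_0$ and a ratio between 0.05 and 0.06, consistent with the estimate of about 0.05 quoted above.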

Finally, we mention some other “near-field” effects which, like $A_1$, depend on details of the short-range interaction and are decreasingly important as the incident energy approaches zero. These effects include: (1) distortion of the wave function by a very large $ef_2(r)P_2(\mathbf{r},\mathbf{s})$ interaction, such as was inferred (Fig. 4) from the parameters of Duncanson and Coulson,$^{31}$ thereby possibly modifying the estimate from Eq. (36) of the ratio of the second to first Born approximation; (2) the contribution, appearing in second Born approximation, which the short-range spherically symmetric $f_0(r)$ term of Eq. (37) makes to the inelastic amplitude $A_{ab}$; (3) electron exchange, which plays no role in the “far field,” where the incident and atomic electron wave functions do not overlap. To sum it up, our approximations are of such a character that for any homonuclear gas, not merely N$_2$, the cross sections of Eq. (20) are increasingly reliable as the incident electron energy decreases to zero, because with diminishing energy the long-range tail of the interaction
becomes increasingly important.

The line width of the Tl$^{203}$ and Tl$^{205}$ nuclear magnetic resonance in thallium and thallium oxide greatly exceeds the dipolar width, and is a function of the abundance of the other isotope. The results can be interpreted in terms of an exchange interaction $A\,\mathbf{I}_1\cdot\mathbf{I}_2$ between a pair of nuclear spins which exceeds the normal dipolar interaction. The exchange between different isotopes leads to broadening. Exchange between like nuclei should lead to narrowing, but it was found that samples containing 98.7 percent Tl$^{205}$ still exhibit lines broader than the dipolar interaction. Two causes are shown to exist: anisotropy of the chemical shift and pseudo-dipolar exchange interaction. Analysis with the method of the moments gives for the exchange interaction constant $Ah^{-1}=17.5$ kc/sec with a 30 percent anisotropic pseudo-dipolar character in the hexagonal metal, and $Ah^{-1}=12$ kc/sec with less than 10 percent pseudo-dipolar character in thallic oxide. The oxide has a chemical shift of +0.55 percent with an anisotropy of 34 percent of this amount. The metal exhibits a shift of 1.56 percent with 16 percent anisotropy.

Ramsey's theory of the nuclear spin exchange via excited electron states in molecules is extended to solids. Most heavy isotopes in metals and insulators should exhibit exchange effects. From the anisotropy of the exchange, information about the relative amount of $p$ or $d$ character of the electron wave function in the solid can be obtained.

It is predicted that thallic oxide has a nuclear Curie point at $3.5\times10^{-6}$ °K. Whether it will become nuclear ferromagnetic or antiferromagnetic depends on details of the electronic band structure.


I. INTRODUCTION


In an earlier paper$^{1}$ an anomalous behavior of the Tl$^{203}$ and Tl$^{205}$ magnetic resonance lines in metallic thallium had been noted, but no satisfactory explanation was given at that time. It was found that the width of the Tl$^{205}$ resonance was about 10 times as large as could be expected from the dipolar broadening, but even more anomalous was the fact that the Tl$^{203}$ resonance was again much broader than the Tl$^{205}$ resonance. The two isotopes both have a spin of $\tfrac12$, and the magnetic moment of Tl$^{203}$ is only one percent smaller than that of Tl$^{205}$. Quadrupolar effects are thus excluded. The only reason why the two isotopes could behave differently seemed to be contained in the fact that they occur in unequal abundance. Natural thallium contains 29.5 percent Tl$^{203}$ and 70.5 percent Tl$^{205}$. Consequently a Tl$^{203}$ nucleus has fewer identical neighbors than a Tl$^{205}$ nucleus. The dipolar width of the Tl$^{203}$ resonance should therefore be smaller than that of Tl$^{205}$. An exchange interaction of the type $A_{12}\,\mathbf{I}_1\cdot\mathbf{I}_2$ between the nuclear spins would act in the opposite direction. Whereas the exchange between like spins causes exchange narrowing, the exchange between unlike isotopic species would cause broadening. The effect of like and unlike magnetic ingredients with dipolar and exchange interaction on the width of magnetic resonance lines has been discussed extensively by Van Vleck.$^{2}$ If the exchange interaction between two thallium nuclei were ten times as large as the classical dipolar interaction, the observations in natural thallium could be explained.

Metallurgical Company, Box 580, Niagara Falls, New York.

1 N. Bloembergen and T. J. Rowland, Acta Metallurgica 1, 731 (1953).

This required magnitude at first sight appears improbably large. The exchange type of coupling between nuclear spins in molecules is well known both experimentally$^{3,4}$ and theoretically.$^{5}$ It occurs via the intermediate excitation of electron orbits. Clearly similar effects could be expected in solids. The order of magnitude of exchange interaction is $A=(W_{\mathrm{hfs}})^{2}/\Delta E$ where $W_{\mathrm{hfs}}$ is the hyperfine interaction in the molecule or solid and $\Delta E$ is an appropriate average distance of the excited electronic state from the ground state.

In all observed molecular spectra the exchange coupling is only a small fraction of the dipolar coupling. In the HD molecule, e.g., the exchange is 43 cps, whereas the dipolar interaction is more than a hundred times as large. In the molecule undergoing frequent collisions in the liquid or the gas, the dipolar interaction averages out to zero making the exchange effect observable. In a rigid lattice of light elements the exchange effect would be completely obscured by the dipolar interaction. All observations on molecules have been done on light isotopes, mostly H, D, F and P$^{31}$. The exchange interaction could be much larger for heavier compounds. The hyperfine interaction in atomic thallium is, e.g., twenty times larger than for hydrogen. If one makes the crude assumption that the hyperfine interaction in metallic thallium is also twenty times as large as in the hydrogen molecule and that the “average excited state” (the meaning of this expression will be made more precise later in this paper) is the same for thallium and the hydrogen molecule, an exchange interaction of 17 kc/sec is obtained between a neighboring pair of thallium nuclei in the metallic lattice. This would have the right order of magnitude to explain the experimental observations. It might be expected that in most compounds with predominantly heavy isotopes the exchange effects will outweigh the classical dipolar interaction.
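The crude estimate of 17 kc/sec follows from the scaling $A\propto(W_{\mathrm{hfs}})^{2}/\Delta E$ with $\Delta E$ held fixed. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def scaled_exchange_cps(reference_exchange_cps, hyperfine_ratio):
    """Scale an exchange coupling by the square of the hyperfine-interaction ratio.

    Assumes the same average excitation energy Delta E in both systems.
    """
    return reference_exchange_cps * hyperfine_ratio**2


# 43 cps in HD, hyperfine interaction taken twenty times larger in thallium.
estimate_cps = scaled_exchange_cps(43.0, 20.0)
estimate_kc = estimate_cps / 1000.0
```

The result is 17.2 kc/sec, which the text rounds to 17 kc/sec.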

In order to test the hypothesis of a large exchange interaction, experiments have been carried out on a series of thallium samples with different isotopic compositions, as the effect of exchange interaction is markedly different between like and unlike pairs. These experimental results are described in Sec. II. The phenomenological theory is described in Sec. III and the experimental results are interpreted in terms of this theory in Sec. IV. It proved necessary to introduce a tensor-type exchange, or pseudo-dipolar, interaction in addition to the scalar interaction. Furthermore evidence for the anisotropy of the chemical shift in thallium oxide is presented. This type of anisotropy has not been reported before in the literature.

2 J. H. Van Vleck, Phys. Rev. 74, 1168 (1948).
3 E. L. Hahn and D. E. Maxwell, Phys. Rev. 88, 1070 (1952).
4 Gutowsky, McCall, and Slichter, J. Chem. Phys. 21, 279 (1953).
5 N. F. Ramsey, Phys. Rev. 91, 303 (1953); N. F. Ramsey and E. M. Purcell, Phys. Rev. 85, 143 (1953).

In the remaining sections an atomistic interpretation 
of the phenomenological exchange constants introduced 
in Sec. III is presented. It is an extension of Ramsey's
theory⁵ for molecules to the periodic lattice. The
interaction is a kind of superexchange via intermediate
excited electronic states. A second-order perturbation 
calculation in the electron-spin-nuclear-spin interaction 
is required. 

While this research neared completion, Ruderman 
and Kittel independently put forward the hypothesis 
of nuclear spin exchange in metals via the intermediary 
of the conduction electrons.⁶ Their theory was
developed along the same lines mentioned above. We
extend their method to include the case of insulators and
the pseudo-dipolar interaction. The latter is especially 
important in pure or nearly pure isotopes and gives 
valuable information about the angular dependence of 
the wave function in the solid. An interesting feature 
of nuclear spin exchange is that theoretical expressions 
for its magnitude can be given. A detailed study of the 
theoretically simpler nuclear exchange may serve to 
elucidate problems in magnetism related to electron 
spin exchange. 


II. EXPERIMENTAL RESULTS


The equipment was the same as used in the earlier 
investigation.¹ A permanent magnet provided a field
of 5560 oersted in a 1} inch gap. Field values were 
always corrected for temperature variations (—1.0 
oersted/degree). Experiments at a lower field of 3288 
oersteds were carried out with an electronically current- 
regulated electromagnet. The pole faces were 5 inches 
in diameter and the inhomogeneity across the sample 
was always less than 0.2 oersted. It never contributed
significantly to the width of the observed broad lines. 
The field was modulated at 280 cps and the nuclear 
absorption was detected with a radio-frequency
spectrometer of the Pound-Knight-Watkins type.⁷ The
output of the 280-cps "lock-in" detector, which
represents the derivative of the absorption curve, was
recorded on an Esterline-Angus recording instrument.
For the detection of the weak lines the time constant 
of the lock-in was made as long as 30 seconds. The 
scanning rate was usually about 2 kc/min, and it might 
take as long as 90 minutes to go completely through a
resonance line. To obtain a more favorable signal-to-noise
ratio all data were taken at 77°K. A few checks on
the temperature dependence of the lines were made
between 77°K and 300°K. No dependence of the line
breadth on temperature was found.

⁶ M. A. Ruderman and C. Kittel, Phys. Rev. 96, 99 (1954).
We are indebted to Prof. Kittel for making this manuscript
available before publication.
⁷ R. V. Pound and W. D. Knight, Rev. Sci. Instr. 21, 219 (1950).

The samples consisted of finely powdered metallic 
thallium and thallic oxide. The metal particles had a 
diameter small compared to the skin depth and were 
suspended in paraffin oil. The metal and the oxide of 
natural abundance were obtained commercially (cp
grade). Enriched isotopic samples of Tl₂O₃ were
obtained from the AEC stable isotope division, which also
provided the mass and spectrographic analysis listed 
in Table I. 

The impurity content is low enough not to affect the 
breadth or shape of the resonance. The amount of 
paramagnetic impurity was high enough to provide a 
conveniently short relaxation time in the oxide,
avoiding saturation effects. The relaxation time in the metal
is determined by the interaction with the conduction
electrons. It may be estimated with the Korringa
relation⁸ from the observed relaxation time T₁ = 2×10⁻³
sec in copper⁹ at 300°K, and the known Knight shifts
of 0.23 percent for copper¹⁰ and 1.56 percent for thallium.
One finds T₁ = 10⁻⁵ sec in thallium at 300°K, and
3.9×10⁻⁵ sec at 77°K. The contribution to the width
Δω/2π from the finite relaxation time is therefore about
4 kc/sec at 77°K. This agrees qualitatively with the
absence of any temperature effect on the observed line
widths in the metal, which are always larger than 16
kc/sec. Experimentally the effects of T₁ are negligible.
It may well be that the Korringa relation gives too
small a value for T₁ in thallium.
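
The arithmetic of this estimate can be retraced with a short calculation. The sketch below is illustrative and not part of the original analysis. It assumes the Korringa behaviour T₁T = constant for the temperature scaling, takes T₁ = 10⁻⁵ sec at 300°K as the starting value, and adopts the convention Δω = 1/T₁ for the lifetime contribution to the width, chosen here because it reproduces a width of about 4 kc/sec. The function names are hypothetical.

```python
import math

def scale_t1(t1_ref, temp_ref, temp):
    """Scale a relaxation time assuming T1 * T = constant (Korringa behaviour)."""
    return t1_ref * temp_ref / temp

def lifetime_width_hz(t1):
    """Width contribution in cycles/sec, assuming delta_omega = 1 / T1."""
    return 1.0 / (2.0 * math.pi * t1)

t1_300 = 1.0e-5                         # sec, thallium at 300 K (starting value)
t1_77 = scale_t1(t1_300, 300.0, 77.0)   # sec, thallium at 77 K
width_77 = lifetime_width_hz(t1_77)     # cycles/sec

print(f"T1(77 K) = {t1_77:.2e} sec, width = {width_77 / 1e3:.1f} kc/sec")
```

With these assumptions the scaled relaxation time and the width come out close to the figures used in the estimate.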

After data on the enriched oxide samples had been 
taken, they were reduced by holding at 260°C in a 
slow stream of pure hydrogen gas. Complete reduction 
was established by weighing and the metal was ob- 
tained in the form of a fine powder, directly suitable 
for the nuclear resonance experiment. This procedure 
had first been tested by the reduction of oxide of the 
natural composition. 

⁸ J. Korringa, Physica 16, 601 (1950).
⁹ A. E. Redfield (to be published).
¹⁰ Townes, Herring, and Knight, Phys. Rev. 77, 852 (1950).
¹¹ H. S. Gutowsky and B. R. McGarvey, Phys. Rev. 91, 81 (1953).

TABLE I. Composition of enriched thallium samples.

Abundance of Tl²⁰⁵    Quantity of element    Dominant impurities
(percent)             (grams)                (percent)

98.7±0.5              0.1014                 Si: 0.08
90.5±0.5              1.7005                 Mn: 0.04
70.5 (commercial)                            Pb: 0.01; Fe <0.001
52.1±0.5              0.3230                 V <0.04
14.0±0.5              0.4161                 Fe <0.04; Mn <0.02; Si <0.08

Data were taken on all samples, the oxide and the
metal with five different isotopic compositions, each at
two external field strengths, H_ext = 5560 and 3288
oersteds, respectively. Both the Tl²⁰³ and the Tl²⁰⁵
resonances were recorded. These isotopes have nearly the
same gyromagnetic ratio,¹ g₂₀₅/g₂₀₃ = 1.009838, and
g₂₀₅ = 2πν_res/H_res = 1.546×10⁴ sec⁻¹ oersted⁻¹. This last
value is determined on the assumption that in a concentrated
aqueous solution of thallium acetate H_res = H_ext. In
other words, this value is not corrected for diamagnetism
of the core or a small chemical shift of less than
two parts in ten thousand, which may exist in the solution
and depends on the concentration of the acetate
ions.¹¹ It was found that the resonance field in the
oxide was 0.55 percent higher than the external field.
This indicates a chemical shift of +0.55 percent in this
solid. The metal exhibits a Knight shift of 1.56 percent.
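
To give a feeling for the magnitudes involved, the resonance frequencies and the separation of the two isotopic resonances can be computed from the quoted gyromagnetic ratio. The following is a minimal sketch, assuming ν = gH/2π with g in rad sec⁻¹ oersted⁻¹. The optional `relative_shift` argument (for a chemical or Knight shift) and all function names are illustrative choices, not taken from the paper.

```python
import math

G205 = 1.546e4            # rad sec^-1 oersted^-1, value quoted in the text
RATIO_205_203 = 1.009838  # g205 / g203, value quoted in the text

def resonance_frequency(g, field_oe, relative_shift=0.0):
    """Resonance frequency in cycles/sec for gyromagnetic ratio g."""
    return g * field_oe * (1.0 + relative_shift) / (2.0 * math.pi)

def isotope_splitting(field_oe, relative_shift=0.0):
    """Separation of the Tl205 and Tl203 resonances in cycles/sec."""
    nu205 = resonance_frequency(G205, field_oe, relative_shift)
    nu203 = resonance_frequency(G205 / RATIO_205_203, field_oe, relative_shift)
    return nu205 - nu203

nu205_high = resonance_frequency(G205, 5560.0)
split_high = isotope_splitting(5560.0)
split_low = isotope_splitting(3288.0)

print(f"Tl205 at 5560 oersted: {nu205_high / 1e6:.2f} Mc/sec")
print(f"splitting: {split_high / 1e3:.0f} kc/sec (5560), {split_low / 1e3:.0f} kc/sec (3288)")
```

The separation scales with the field, which is why an exchange interaction of some tens of kc/sec becomes comparable with the unperturbed splitting only at the lower fields.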

Typical recordings are reproduced in Fig. 1 for the 
oxide in the higher field. It is seen at a glance that the 
hypothesis of a large exchange interaction is correct. 
The Tl²⁰³ line becomes narrower, the smaller the Tl²⁰⁵
concentration, and the Tl²⁰⁵ line becomes narrower, the
smaller the Tl²⁰³ concentration. For the 48 percent
Tl²⁰³-52 percent Tl²⁰⁵ composition the lines are nearly
identical, as each isotope has the same average number 
of nonequivalent neighbors. 

The resonance in 98.7 percent pure Tl²⁰⁵ is of special
interest. One would expect a line narrower than the
dipolar width due to exchange narrowing between like
neighbors. The observed resonance is still several times
broader than the dipolar width and exhibits a marked
asymmetry. A similar type of asymmetry had been
noted previously in powdered tetragonal tin and was
interpreted as an anisotropy of Knight shift. The
hypothesis that the asymmetric line in the powdered
(Tl²⁰⁵)₂O₃ is due to an asymmetry in the chemical shift
is confirmed by the recordings at lower field. Whereas
the dipolar and exchange broadening are independent
of H₀ and the width of the Tl²⁰⁵ resonance in samples
with a relatively high Tl²⁰³ concentration is field
independent, the width of the 98.7 percent Tl²⁰⁵ resonance is
much smaller at the lower field and the asymmetry is
markedly reduced. The integrated absorption curves
obtained from the experimental recordings are shown
in Fig. 2. The crystal structure of Tl₂O₃ is cubic,¹² but
anisotropy is possible as the individual thallium nuclei
are not in positions of cubic symmetry. The unit cell
contains thirty-two thallium atoms. Eight of these are
located on the body diagonals of the unit cell and
therefore have threefold axial symmetry. They have six
nearest neighbors at 3.35 Å, and six others at 3.95 Å.
The other twenty-four thallium atoms are in positions
of low symmetry. They have four neighbors at 3.35 Å,
two at 3.48 Å, four at 3.75 Å, and two at 3.95 Å. Other
neighbors are relatively much farther away.

¹² R. W. G. Wyckoff, Structure of Crystals (Interscience
Publications, New York, 1931), p. 253.

Fig. 1. Experimental recordings of the derivative of the nuclear
magnetic resonance absorption in Tl₂O₃ for various isotopic
compositions. (Labels in the figure: Tl²⁰³ and Tl²⁰⁵ resonances in
H₀ = 5560 oersted in enriched powder samples of Tl₂O₃; frequency
markers in kilocycles per second.)

Experimentally the distance between the maximum
and minimum in the derivative curve is most accurately
determined. This width between points of maximum
slope, Δν_msl, is unfortunately not accessible to direct
theoretical interpretation. The second moment of the
line,

$$\langle \Delta\nu^2 \rangle_{Av} = \int (\Delta\nu)^2 f(\nu)\, d\nu, \qquad \int f(\nu)\, d\nu = 1,$$

has more theoretical significance, but especially the
contribution from the tails of experimental curves is
rather hard to evaluate with precision. The method of
evaluation has been described by Pake and Purcell.¹³
Table II lists the values of Δν_msl and [⟨Δν²⟩_Av]^{1/2},
determined as an average over several recordings. The ratio
in the last column of Table II should be 2.0 for a pure


TABLE II. Line width of nuclear magnetic
resonance in Tl₂O₃ at 77°K.








Tl²⁰⁵ resonance
Percent 
abundance [(Av®)ay]? 
T1205 ke/sec 


Hext in Avmsl 
oersted kc/sec 


7 5560 8.3 
Lo 5560 10.5 
.9 5560 20. 
3 5560 32. 
0 

7 

By) 


Avmsi 
[(Av2)ay]? 





5560 > 60. 


: 3288 5.6 
3288 7.4 
3288 


98 
90 
0 
2 
4 
8 


7 
5 
1 
9 


90 
70.5 


Percent 
abundance 
T1™ 


18.0 





Tl²⁰³ resonance





29.5 48 
47.9 33 
86.0 14 

11 








Gaussian and 0 for a Lorentzian type of curve. If an 
effective second moment for the Lorentzian is identified 
with the square of the half-width at half-maximum 
absorption, the ratio would be 1.15. It is seen that 
qualitatively the character of the line shape changes 
from Gaussian to Lorentzian, as the abundance of the 
unlike species is reduced. Of course, the table does not 
give an indication of the asymmetry discussed 
previously. 
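
The two limiting values of this ratio can be checked numerically. The sketch below is an illustration, not part of the original evaluation. It locates the extrema of the derivative of a Gaussian of unit rms width and of a Lorentzian of unit half-width at half-maximum on a grid, and forms the ratio of their separation to the unit width. The grid parameters and function names are arbitrary choices.

```python
import math

def derivative_peak_separation(shape, half_range=5.0, n=20001, h=1e-5):
    """Distance between the maximum and minimum of d(shape)/dx on a grid."""
    step = 2.0 * half_range / (n - 1)
    xs = [-half_range + i * step for i in range(n)]
    slopes = [(shape(x + h) - shape(x - h)) / (2.0 * h) for x in xs]
    x_at_max = xs[slopes.index(max(slopes))]
    x_at_min = xs[slopes.index(min(slopes))]
    return abs(x_at_max - x_at_min)

def gaussian(x):
    """Gaussian with unit root-mean-square width."""
    return math.exp(-0.5 * x * x)

def lorentzian(x):
    """Lorentzian with unit half-width at half-maximum."""
    return 1.0 / (1.0 + x * x)

# Ratio of the peak-to-peak derivative width to the (effective) rms width.
ratio_gauss = derivative_peak_separation(gaussian) / 1.0
ratio_lorentz = derivative_peak_separation(lorentzian) / 1.0

print(f"Gaussian: {ratio_gauss:.3f}, Lorentzian: {ratio_lorentz:.3f}")
```

The Gaussian gives 2 and the Lorentzian gives 2/√3, about 1.15, when its half-width is used in place of the divergent second moment.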

Data for the 52 percent-48 percent composition at
low fields are not listed. In this case the Tl²⁰³ and Tl²⁰⁵
resonances are not entirely separated, as the exchange
interaction becomes comparable to the energy difference
between the unperturbed Tl²⁰³ and Tl²⁰⁵ resonances. The
integrated derivative curve gives the absorption line
shown in Fig. 3. It should be noted that as the two
resonances begin to coalesce, the resonance maxima
pull inward from their unperturbed positions. The
distance between the maxima is reduced by 21 percent in
this case.

¹³ G. E. Pake and E. M. Purcell, Phys. Rev. 74, 1184 (1948).

Fig. 2. Integrated line shapes of the Tl²⁰⁵ resonance in Tl₂O₃
with 98.7 percent Tl²⁰⁵ at two different field strengths. The dotted
curve (a) is the theoretical shape for powders with axial symmetry
and (b) for lower symmetry. 25 percent of curve (a)+75
percent of curve (b) with the addition of a small amount of
dipolar broadening should be compared with the full-drawn
experimental curve. (Horizontal scale in kc/sec.)

Turning to the corresponding results in the metal, it
may be remarked at the outset that these are
qualitatively similar to those in the oxide, although there
are important quantitative differences. The data in
Table III were collected from a set of recordings which
are not reproduced, but are similar to those for the
oxide shown in Fig. 1.

The lines are broader than in the corresponding oxide
samples, indicating a larger exchange interaction. The
lines again change gradually from approximately
Gaussian to Lorentzian character as the unlike isotope
concentration decreases. Again the case of 98.7 percent
pure Tl²⁰⁵ is of special interest. No pronounced exchange
narrowing is observed. On the contrary, the line is
about seven times as broad as the dipolar interaction
for the nearly pure isotope would lead one to predict. This
time, however, the width is not much dependent on the
field strength and the asymmetry is not pronounced.
This is shown clearly by the experimental derivative
curves, reproduced in Fig. 4. Perhaps the slight
asymmetry in the higher field and the small field dependence
could be attributed to the anisotropy of Knight shift
in the hexagonal metal. Each thallium atom has six
neighbors at 3.401 Å and six neighbors at 3.450 Å in the
almost ideally close-packed structure. The major cause
for the line broadening, however, must be found
elsewhere. It will be interpreted in terms of a
pseudo-dipolar or tensor-exchange interaction.

Fig. 3. The Tl²⁰³ and Tl²⁰⁵ resonance in Tl₂O₃ in a low magnetic
field. The integrated experimental line shape is given by the
full-drawn curve. The two resonances have shifted from the
unperturbed positions at ±1 on the horizontal axis and begin to merge.
Dotted curves (a) and (b) are two theoretical shapes, represented
by Eqs. (23) and (25), respectively. (Labels in the figure: 8.045
Mc/sec and 8.124 Mc/sec; vertical scale in arbitrary units.)

TABLE III. Line width of nuclear magnetic resonance
in metallic thallium at 77°K. Tl²⁰⁵ resonance.

Percent      H_ext in    Δν_msl    [⟨Δν²⟩_Av]^{1/2}    Δν_msl /
abundance    oersted     kc/sec    kc/sec              [⟨Δν²⟩_Av]^{1/2}
Tl²⁰⁵

98.7         5560        20        17                  1.2
90.5         5560        23        20                  …
70.5         5560        33        24.2                …
52.1         5560        54        28.5                …
14.0         5560        >60       33                  …
98.7         3288        16.7      15                  …
90.5         3288        19        16.8                …
70.5         3288        33        24.2                …

The lines in the lower field in thallium of natural 
abundance begin to overlap appreciably but even the 
30 percent Tl²⁰³ line is observable. In the 50 percent-
50 percent composition, however, no resonance was 
observed. Apparently the exchange interaction has the 
same magnitude as the unperturbed splitting. The 
process of which the onset was shown in Fig. 3 for the 
oxide, is so advanced in the metal, that an extremely 
broad unobservable structure results. The interesting 
effects which may occur at still lower fields will be dis- 
cussed in the next section. A complete merger of the 
two lines may be expected for fields less than 10³
oersteds. One trial run in the metal of natural abundance 
at H=1600 oersteds has been made. It would be neces- 
sary to cool to liquid helium temperatures to determine 
the structure with any precision. 

This qualitative description of the experimental
results has shown what factors should be considered in
the theory of magnetic line broadening in these
powdered thallium and thallium oxide samples. They are:
ordinary dipolar interaction, pseudo-dipolar interaction
and nuclear spin exchange between like and unlike pairs
of nuclear spins, and anisotropy of the chemical or
Knight shift. The influence of the finite spin-lattice
relaxation time can be neglected.

III. PHENOMENOLOGICAL THEORY OF LINE
BROADENING AND LINE SHAPE


(a) Second Moment Calculation 


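The basis for a quantitative interpretation of the
results is Van Vleck's theory² of magnetic resonance line
broadening. The Hamiltonian for a system with two
magnetic ingredients is, in Van Vleck's notation,

$$\mathcal{H} = \mathcal{H}_Z + \mathcal{H}_{exch} + \mathcal{H}_{dip}, \tag{1}$$

$$\mathcal{H}_Z = g\beta H \sum_j I_{zj} + g'\beta H \sum_{k'} I_{zk'}, \tag{1a}$$

$$\mathcal{H}_{exch} = \sum_{i>j} A_{ij}\, \mathbf{I}_i \cdot \mathbf{I}_j + \sum_{k'>l'} A_{k'l'}\, \mathbf{I}_{k'} \cdot \mathbf{I}_{l'} + \sum_{j,k'} A_{jk'}\, \mathbf{I}_j \cdot \mathbf{I}_{k'}, \tag{1b}$$

$$\mathcal{H}_{dip} = \sum_{i>j} (g^2\beta^2 r_{ij}^{-3} + B_{ij}) [\mathbf{I}_i \cdot \mathbf{I}_j - 3 r_{ij}^{-2} (\mathbf{I}_i \cdot \mathbf{r}_{ij})(\mathbf{I}_j \cdot \mathbf{r}_{ij})]$$
$$\qquad + \sum_{j,k'} (g g'\beta^2 r_{jk'}^{-3} + B_{jk'}) [\mathbf{I}_j \cdot \mathbf{I}_{k'} - 3 r_{jk'}^{-2} (\mathbf{I}_j \cdot \mathbf{r}_{jk'})(\mathbf{I}_{k'} \cdot \mathbf{r}_{jk'})]$$
$$\qquad + \sum_{k'>l'} (g'^2\beta^2 r_{k'l'}^{-3} + B_{k'l'}) [\mathbf{I}_{k'} \cdot \mathbf{I}_{l'} - 3 r_{k'l'}^{-2} (\mathbf{I}_{k'} \cdot \mathbf{r}_{k'l'})(\mathbf{I}_{l'} \cdot \mathbf{r}_{k'l'})].$$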


In our case the unprimed symbols may refer to Tl²⁰⁵
and the primed symbols to Tl²⁰³. ℋ_Z is the Zeeman
energy in the magnetic field H. The exchange and
dipolar interactions are separated into pairs of primed
and unprimed variety alone, and into mixed pairs.
The B's represent the pseudo-dipolar interaction. The
mean square absorption frequency of the unprimed
resonance is given by

$$h^2 \langle \Delta\nu^2 \rangle_{Av} = \tfrac{1}{3} I(I+1) \sum_k \mathcal{B}_{jk}^2 + \tfrac{1}{3} I'(I'+1) \sum_{k'} \mathcal{C}_{jk'}^2, \tag{2}$$

$$\mathcal{B}_{jk} = -\tfrac{3}{2} (B_{jk} + g^2\beta^2 r_{jk}^{-3}) (3\cos^2\theta_{jk} - 1), \tag{2a}$$

$$\mathcal{C}_{jk'} = A_{jk'} - (B_{jk'} + g g'\beta^2 r_{jk'}^{-3}) (3\cos^2\theta_{jk'} - 1), \tag{2b}$$

where θ_jk is the angle between r_jk and H. The exchange
interaction between equivalent spins does not
contribute to the second moment. Certain simplifying
assumptions will now be made to bring (2) into a form
which will permit direct comparison with the
experimental results.

In the powders the contribution of each spin pair has
to be averaged over all angles θ. Using

$$\langle 3\cos^2\theta - 1 \rangle_{Av} = 0, \qquad \langle (3\cos^2\theta - 1)^2 \rangle_{Av} = \tfrac{4}{5},$$

it is seen that the exchange and the dipolar interaction
between unlike spins contribute independently to the
second moment. This result would still be
approximately true in many single crystals when each spherical
shell contains a sizeable number of atoms.
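
The two powder averages can be verified by direct integration over orientations, using the fact that cos θ is uniformly distributed on (−1, 1) for randomly oriented crystallites. The following is a small illustrative check with a midpoint rule. The function name and the number of grid points are arbitrary.

```python
def powder_average(func, n=20000):
    """Average of func(cos_theta) over an isotropic distribution of directions."""
    du = 2.0 / n
    total = 0.0
    for i in range(n):
        u = -1.0 + (i + 0.5) * du   # midpoint of the i-th interval in cos(theta)
        total += func(u)
    return total * du / 2.0

mean_first = powder_average(lambda u: 3.0 * u * u - 1.0)
mean_square = powder_average(lambda u: (3.0 * u * u - 1.0) ** 2)

print(f"<3cos^2 - 1> = {mean_first:.6f}, <(3cos^2 - 1)^2> = {mean_square:.6f}")
```

Because the first average vanishes, the cross term between the exchange constant and the angular factor in Eq. (2b) drops out of the powder average, which is the reason the two contributions add independently.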

Fig. 4. Experimental recordings of the derivative of the nuclear
magnetic resonance absorption in 98.7 percent Tl²⁰⁵ metallic
thallium at two different field strengths, at 77°K. (Labels in the
figure: modulation swing 3.6; scale in kc/sec.)

In powders the effective field H may depend on the
orientation of the crystallite,

$$\mathbf{H} = (\mathsf{1} + \boldsymbol{\delta}) \cdot \mathbf{H}_{ext},$$

where 𝟣 is the unit dyadic and δ is the chemical shift
tensor, which has a fixed relation to the crystallographic
axes. Since |δ| ≪ 1, the omission of terms quadratic in
|δ| leads to

$$|\mathbf{H}| = |\mathbf{H}_{ext}| \{1 + \delta_{is} + \delta_{ax}(3\cos^2\theta - 1) + \delta_{asym}\sin^2\theta\cos 2\varphi\}, \tag{3}$$

where θ and φ specify the orientation of H with respect
to the chemical shift ellipsoid. The chemical shifts in
the three principal directions are

$$(\nu_1 - \nu_0)\,\nu_0^{-1} = \Delta\nu_1/\nu_0 = \delta_{is} + 2\delta_{ax},$$
$$(\nu_2 - \nu_0)\,\nu_0^{-1} = \Delta\nu_2/\nu_0 = \delta_{is} - \delta_{ax} + \delta_{asym}, \tag{4}$$
$$(\nu_3 - \nu_0)\,\nu_0^{-1} = \Delta\nu_3/\nu_0 = \delta_{is} - \delta_{ax} - \delta_{asym}.$$

δ_is is the relative isotropic shift. For positions of axial
symmetry δ_asym = 0, for positions of cubic symmetry
δ_ax = 0. The terms in δ_ax and δ_asym will make a
contribution to the second moment, which is proportional to
the square of the resonance frequency ν₀ and
independent of the dipolar and exchange contributions.
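
The relation between the angular expression for the field and the three principal shifts can be confirmed by evaluating the former along the principal directions. The sketch below does this for illustrative numbers: δ_is is set to the 0.55 percent oxide shift quoted earlier, while the values of δ_ax and δ_asym are hypothetical and chosen only for the demonstration.

```python
import math

def relative_field(theta, phi, d_is, d_ax, d_asym):
    """|H| / |H_ext| to first order in the shift tensor, as in Eq. (3)."""
    return (1.0 + d_is
            + d_ax * (3.0 * math.cos(theta) ** 2 - 1.0)
            + d_asym * math.sin(theta) ** 2 * math.cos(2.0 * phi))

d_is = 5.5e-3      # isotropic shift, 0.55 percent
d_ax = 1.0e-3      # hypothetical axial anisotropy
d_asym = 0.4e-3    # hypothetical asymmetry parameter

# Field along the three principal axes of the shift ellipsoid.
shift_1 = relative_field(0.0, 0.0, d_is, d_ax, d_asym) - 1.0
shift_2 = relative_field(math.pi / 2, 0.0, d_is, d_ax, d_asym) - 1.0
shift_3 = relative_field(math.pi / 2, math.pi / 2, d_is, d_ax, d_asym) - 1.0

print(shift_1, shift_2, shift_3)
```

The three values reproduce δ_is + 2δ_ax, δ_is − δ_ax + δ_asym, and δ_is − δ_ax − δ_asym, and their mean is δ_is, as expected for a traceless anisotropic part.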

A priori we have no knowledge of the quantities A 
and B which have been introduced in a strictly phe- 
nomenological way. Anticipating results of the final 
sections, it is assumed that their magnitude decreases 
very rapidly with distance between the two components 
of a pair. In fact they decrease exponentially with r in 
an insulator, and they decrease as rapidly as the dipolar 
interaction in the metal, or proportional to r⁻³. We
shall therefore restrict the sums to the twelve nearest 
neighbors (z=12) in both the metal and the oxide. In 
the former the omission of further lattice sites will 
introduce an error of about 20 percent as a quantitative 
evaluation of the complete pure dipolar sum for the 
thallium lattice shows. The error in the oxide for the 
exchange and pseudo-dipolar contributions is much 
smaller than this. 

It will also become clear from the atomic theory of
exchange between a nuclear spin pair, that the
interactions are proportional to the g-factors of the two
nuclei:

$$A_{ij} = g g'^{-1} A_{jk'} = g^2 g'^{-2} A_{k'l'},$$
$$B_{ij} = g g'^{-1} B_{jk'} = g^2 g'^{-2} B_{k'l'}.$$

Since (g−g')/g ≈ 1 percent, it is assumed gg'⁻¹ = 1.
Furthermore it is assumed that all twelve nearest 
neighbors are equivalent which is true for a fcc lattice, 
but is an approximation for the actual thallium and 
T1,03 lattices. In taking the numerical values of g and 
g’ equal for computational purposes, it should be kept 
in mind that their difference, however small, is re- 
sponsible for the important distinction between like 
and unlike spins. 

If two different isotopes happened to have the same
g-value, they should be considered as equivalent
neighbors in the context of this paper and exchange
broadening would not occur. It cannot be overemphasized that
the Pauli exclusion principle between identical particles 
and related concepts do not enter into the discussion of 
nuclear spin exchange. The spin orientation only of the 
nuclei, whose orbits never overlap, is exchanged via 
intermediate electrons. 

Introducing the relative concentration f for Tl²⁰³ and
1−f for Tl²⁰⁵ in the completely random mixture of
isotopes, the second moment expression (2) for the
Tl²⁰⁵ resonance with respect to its center of gravity,

$$\nu_c = g\beta h^{-1} H_{ext}(1 + \delta_{is}), \tag{5}$$

becomes for z = 12 nearest neighbors at a distance a,


4 4 
(ar = eI wt tune [5/20 


{2 +2a-n fat ‘say (6) 
sf : " "rid hob 


This formula will be used to interpret the observed 
values of the second moment. 


(b) Line Shape and Fourth Moment 


Van Vleck has given a complete expression for the
fourth moment, but this quantity cannot be determined
experimentally with any precision. The following
discussion of two limiting cases proceeds along the same
lines as Kittel's argument.¹⁴

Extreme exchange broadening, |f^{1/2}A| ≫ |B + g²β²a⁻³|.
The dominant terms in the fourth moment are those in
A⁴. With the same simplifying assumptions this
dominant term is found from Van Vleck's general expression
to be

$$h^4 \langle \Delta\nu^4 \rangle_{Av} = (9/4)(5f + 11f^2) A^4. \tag{7}$$

Using only the dominant term 3fA² in the expression
for the second moment, one obtains


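$$[\langle \Delta\nu^4 \rangle_{Av}]^{1/4} / [\langle \Delta\nu^2 \rangle_{Av}]^{1/2} = 1.41 \text{ for } f = 1$$
$$\qquad = 1.51 \text{ for } f = 0.5$$
$$\qquad = 1.66 \text{ for } f = 0.25$$
$$\qquad = 1.94 \text{ for } f = 0.1.$$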
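
The dependence of this ratio on the concentration f follows directly from the two dominant terms. The sketch below is an illustration that assumes the fourth moment (9/4)(5f + 11f²)A⁴ and the second moment 3fA², both in units where A = h = 1. It evaluates the ratio for a few concentrations and, for comparison, the Gaussian value 3^{1/4}. The function name is hypothetical.

```python
def moment_ratio(f):
    """Fourth root of the fourth moment over square root of the second moment."""
    fourth = 9.0 / 4.0 * (5.0 * f + 11.0 * f * f)   # in units of A**4
    second = 3.0 * f                                # in units of A**2
    return fourth ** 0.25 / second ** 0.5

gaussian_ratio = 3.0 ** 0.25   # a Gaussian has fourth moment 3 * (second moment)**2

for f in (1.0, 0.5, 0.25, 0.1):
    print(f"f = {f:4.2f}: ratio = {moment_ratio(f):.2f}")
print(f"Gaussian: {gaussian_ratio:.2f}")
```

With these two terms alone the ratio is √2 at f = 1 and grows steadily as f decreases. At f = 0.1 the sketch comes out a few percent above the tabulated entry, which may reflect rounding or a term not captured by the two dominant contributions used here.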


The ratio would be 1.32 for a Gaussian distribution.
For small f the fourth moment tends to dominate the
square of the second moment. This is the dilution effect,
discussed by Abrahams and Kittel. For f=0.1, however,
the dipolar terms become of equal importance with the
exchange broadening terms, and a more potent
narrowing mechanism will take over.

¹⁴ C. Kittel and E. Abrahams, Phys. Rev. 90, 238 (1953); M. A.
Ruderman and C. Kittel, reference 6.

Exchange narrowing dominant, |f^{1/2}A| ≪ |B + g²β²a⁻³|,
but |A| > |B + g²β²a⁻³|. This limiting case is identical
with the exchange narrowing for one isotopic constituent
considered by Van Vleck,


[(A wt B+g'6%a-* 


For extreme narrowing, when this ratio is large
compared to unity, the line assumes a Lorentzian shape
near the center,

$$f(\nu) = \{1 + (\nu - \nu_0)^2 / \Delta\nu_{eff}^2\}^{-1}, \tag{8}$$

with an effective half-width

$$\Delta\nu_{eff} = \tfrac{1}{2}\sqrt{3}\, \Delta\nu_{msl} = \frac{h \langle \Delta\nu^2 \rangle_{Av}}{z^{1/2} A \lambda}. \tag{9}$$

If this Lorentzian is cut off at a frequency

$$\nu_{co} = \lambda (\pi/2) z^{1/2} A h^{-1}, \tag{10}$$

the correct value for the second moment is obtained.
The numerical constant λ is of the order of unity. Its
actual value depends on the detailed model of exchange
narrowing.¹⁵

(c) Line Shape and Anisotropy of the Chemical Shift 


It was already mentioned that the line shape in the
oxide for f ≪ 1 is predominantly determined by the
anisotropy of the shift rather than the exchange
narrowing. The problem of the line shape in powder samples
due to anisotropy in the resonance condition has
already been solved elsewhere by the authors.¹ The same
solution applies to powder line patterns in the presence
of anisotropic g-factors and quadrupole broadening in
fields of nonaxial symmetry. Since an incorrect line
shape has been published in this Journal,¹⁶ the
derivation will be outlined briefly, in a slightly improved form.
The resonance frequency is given by

$$\nu^2 = \nu_1^2 \cos^2\theta + \nu_2^2 \sin^2\theta \cos^2\varphi + \nu_3^2 \sin^2\theta \sin^2\varphi. \tag{11}$$

Without loss of generality it is assumed that ν₁ > ν₂ > ν₃,
and these quantities are given by Eq. (4). The polar
angles θ and φ determine the direction of the field H
with respect to the axes of the resonance frequency
ellipsoid. When the differences between ν₁, ν₂, and ν₃
are small, Eq. (11) reduces to

$$\nu = \nu_1 \cos^2\theta + \nu_2 \sin^2\theta \cos^2\varphi + \nu_3 \sin^2\theta \sin^2\varphi. \tag{12}$$

Transform from the variables θ and φ to the set ν and
θ' = θ. The line shape in a powder is given by

$$I(\nu) = \int_{\text{real values}} \frac{\partial(\varphi, \theta)}{\partial(\nu, \theta')} \sin\theta'\, d\theta'.$$

The limits of integration are determined by the
condition of physical reality. The integration is to be
extended over all real values of θ which are compatible
with a given real value of ν. The final result can be
brought into the form of the tabulated¹⁷ complete
elliptic integral K(sin α). The normalized line shape
functions are

$$f(\nu) = \pi^{-1} (\nu - \nu_3)^{-1/2} (\nu_1 - \nu_2)^{-1/2} K(\sin\alpha)$$
with
$$\sin^2\alpha = (\nu_2 - \nu_3)(\nu_1 - \nu) / (\nu_1 - \nu_2)(\nu - \nu_3) \quad \text{for } \nu_1 > \nu > \nu_2,$$

$$f(\nu) = \pi^{-1} (\nu_1 - \nu)^{-1/2} (\nu_2 - \nu_3)^{-1/2} K(\sin\alpha) \tag{13}$$
with
$$\sin^2\alpha = (\nu - \nu_3)(\nu_1 - \nu_2) / (\nu_1 - \nu)(\nu_2 - \nu_3) \quad \text{for } \nu_2 > \nu > \nu_3,$$

$$f(\nu) = 0 \quad \text{for } \nu > \nu_1 \text{ or } \nu < \nu_3.$$

For axial symmetry ν₃ = ν₂, the shape function reduces
properly to

$$f_{ax}(\nu) = \tfrac{1}{2} (\nu_1 - \nu_2)^{-1/2} (\nu - \nu_2)^{-1/2}. \tag{14}$$

These theoretical shapes are shown as dotted curves in
Fig. 2.

¹⁵ P. W. Anderson, J. Phys. Soc. Japan 9, 316 (1954).
¹⁶ C. Kikuchi and V. W. Cohen, Phys. Rev. 93, 394 (1954).

Equations (13) and (14) have to be replaced by more
complicated expressions, if the relative shifts are not
small. Equation (11) has to be used instead of (12).
To obtain the distribution of nuclei over the resonant
frequencies ν, every frequency in (13) and (14) has to
be replaced by its square and a factor 2ν must be
added. The final expression for the intensity
distribution should also take into account that the
radio-frequency field is not quite perpendicular to the internal
field, but makes an angle χ. For each crystalline
orientation one has to multiply by sin²χ, which may be
expressed in terms of θ and φ. For large axial anisotropy,
Eq. (14) has to be replaced by the powder line shape
function,

$$f_{ax}(\nu) = \nu\, (\nu_1^2 - \nu_2^2)^{-1/2} (\nu^2 - \nu_2^2)^{-1/2} \left(1 - \frac{(\nu^2 - \nu_2^2)(\nu_1^2 - \nu^2)}{2\nu^2 (\nu_1 + \nu_2)^2}\right). \tag{15}$$
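
The qualitative features of the powder pattern for small shifts can be reproduced by brute-force sampling of orientations, without the elliptic integral. The sketch below is an illustration only: it evaluates the small-shift resonance condition ν = ν₁cos²θ + ν₂sin²θcos²φ + ν₃sin²θsin²φ on a uniform grid in cos θ and φ (one octant suffices by symmetry) and collects a histogram. The principal values are hypothetical numbers in arbitrary units, and the names are not taken from the paper.

```python
import math

def resonance_frequency(u, phi, nu1, nu2, nu3):
    """Small-shift resonance frequency for cos(theta) = u and azimuth phi."""
    s2 = 1.0 - u * u
    return nu1 * u * u + nu2 * s2 * math.cos(phi) ** 2 + nu3 * s2 * math.sin(phi) ** 2

def powder_histogram(nu1, nu2, nu3, bins=50, n_u=300, n_phi=300):
    """Histogram of resonance frequencies over an isotropic powder; also the mean."""
    counts = [0] * bins
    total = 0.0
    width = (nu1 - nu3) / bins
    for i in range(n_u):
        u = (i + 0.5) / n_u                           # cos(theta) in (0, 1)
        for j in range(n_phi):
            phi = (j + 0.5) * (math.pi / 2.0) / n_phi  # azimuth in (0, pi/2)
            nu = resonance_frequency(u, phi, nu1, nu2, nu3)
            total += nu
            k = min(int((nu - nu3) / width), bins - 1)
            counts[k] += 1
    return counts, total / (n_u * n_phi)

nu1, nu2, nu3 = 1.0, 0.2, -0.4    # hypothetical principal values, nu1 > nu2 > nu3
counts, mean = powder_histogram(nu1, nu2, nu3)
peak_bin = counts.index(max(counts))
nu2_bin = int((nu2 - nu3) / ((nu1 - nu3) / len(counts)))

print(f"mean = {mean:.4f}, peak bin = {peak_bin}, bin containing nu2 = {nu2_bin}")
```

The histogram is confined between ν₃ and ν₁, peaks at the intermediate principal value ν₂, and has its centroid at (ν₁ + ν₂ + ν₃)/3, in line with the behaviour of the analytic shape functions.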





(d) Line Shape in Small External Fields 


Finally, we turn to the question of the line shape in
small external fields, when the exchange interaction
becomes comparable with the difference of the
unperturbed energies. Van Vleck's expressions do not
apply to this case, as the Hamiltonian has to be
truncated in a different manner when A > (g−g')βH.
The exchange between like spins commutes with the
total Zeeman energy, but the exchange between unlike
pairs does not so commute. The two spins in a pair
would be considered as nonequivalent if g₁H₁ ≠ g₂H₂.
If the exchange energy between such a pair of nuclear
spins with different g-values in the same field or the
same g-value in different fields is small, the problem can
be treated by a perturbation procedure, i.e., the
Hamiltonian can be truncated in the usual manner described
in Van Vleck's paper.

¹⁷ E. Jahnke and F. Emde, Tables of Functions (Dover
Publications, Inc., New York, 1945), pp. 52-85.

The problem of large exchange between nonequivalent
spins is a difficult one. In the limit of very large
exchange one could follow Pryce¹⁸ and consider the
difference in the spins as a small perturbation. The
Hamiltonian, from which we omit the dipolar
interactions to avoid lengthy computations, can be split in
the following manner:


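$$\mathcal{H}_Z + \mathcal{H}_{exch} = \beta [f g H + (1-f) g' H'] \Big(\sum_j I_{zj} + \sum_{k'} I_{zk'}\Big) + \mathcal{H}_{exch}$$
$$\qquad + \beta (gH - g'H') \Big[(1-f) \sum_j I_{zj} - f \sum_{k'} I_{zk'}\Big]. \tag{16}$$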


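The interaction with the radio-frequency field was
formerly split into the contributions of two different
species. This should also be changed:

$$\beta H_{rf} \Big(g \sum_j I_{xj} + g' \sum_{k'} I_{xk'}\Big) = \beta H_{rf} \Big\{\tfrac{1}{2}(g + g') \Big(\sum_j I_{xj} + \sum_{k'} I_{xk'}\Big)\Big\}$$
$$\qquad + \Big[\tfrac{1}{2} \beta H_{rf} (g - g') \Big(\sum_j I_{xj} - \sum_{k'} I_{xk'}\Big)\Big]. \tag{17}$$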

The operator between square brackets will always
change the sign of the exchange energy between unlike
spins. It produces satellite absorption at frequencies
of the magnitude of the exchange frequency and will
henceforth be omitted. The commutator of the first
term on the right-hand side of (17) with the Hamiltonian
(16) can now be calculated, and expressions for the
second and fourth moments derived, using Van Vleck's
Eqs. (7-10) and (17):

$$\langle \Delta\nu^2 \rangle_{Av} = h^{-2} \beta^2 (gH - g'H')^2 f(1-f). \tag{18}$$


Here Δν = ν − ν_c, where the centroid frequency is given
by

$$\nu_c = h^{-1} \beta \{f g H + (1-f) g' H'\}. \tag{19}$$

In writing the Hamiltonian (16), this result was
anticipated so that ⟨Δν²⟩_Av is minimized. The
simplification that all spins have the same total angular
momentum, I, has been made. The expression for the
fourth moment becomes

$$\langle \Delta\nu^4 \rangle_{Av} = h^{-4} \beta^4 (gH - g'H')^4 \{f^4(1-f) + (1-f)^4 f\}$$
$$\qquad + h^{-4} \beta^2 (gH - g'H')^2 A^2 f(1-f)\, \tfrac{2}{3} z I(I+1),$$
$$\langle \Delta\nu^4 \rangle_{Av} = \langle \nu^4 \rangle_{Av} - \nu_c^4 - 6 \nu_c^2 \langle \Delta\nu^2 \rangle_{Av} - 4 h^{-3} \beta^3 (gH - g'H')^3 \nu_c f(1-f)(1-2f). \tag{20}$$

It has been assumed that the exchange interaction has
the value A for z nearest neighbors, and vanishes for
all other pairs. The last term gives a measure for the
asymmetry of the absorption around the centroid. It
vanishes properly when f = ½, since a symmetrical
absorption pattern should result, when the two isotopes
occur in equal concentration.

¹⁸ M. H. L. Pryce, Nature 162, 538 (1948).

The dominant term is the second term on the
right-hand side and the fourth moment is seen to be larger
than the square of the second moment by a factor of
the order zA²/(gH−g'H')². In analogy with the dipolar
exchange narrowing, one may interpret this to mean that
there is a narrow line at the centroid position with an
effective width (gH−g'H')²/Az^{1/2}. In the simple case of just
two spins it is easily verified that in the limit of large
exchange the absorption spectrum consists of a pair of
lines, each a distance (gH−g'H')²/4A from the central
position.
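
The two-spin statement can be made concrete with the standard result for two coupled spins ½ (the AB quartet of high-resolution resonance), which is used here as an assumed model and is not derived in the paper. With δ the difference of the two unperturbed frequencies and A the exchange coupling, both in frequency units and A taken positive, the four lines lie at ±(C − A)/2 and ±(C + A)/2 from the mean frequency, where C = (A² + δ²)^{1/2}, with relative intensities 1 + A/C and 1 − A/C. The numbers below are hypothetical.

```python
import math

def ab_spectrum(delta, coupling):
    """Line offsets from the mean frequency and relative intensities
    for two spins 1/2 (standard AB-quartet result, coupling > 0)."""
    c = math.hypot(delta, coupling)
    inner = 0.5 * (c - coupling)
    outer = 0.5 * (c + coupling)
    strong = 1.0 + coupling / c
    weak = 1.0 - coupling / c
    return [(-outer, weak), (-inner, strong), (inner, strong), (outer, weak)]

delta = 1.0       # difference of unperturbed frequencies, hypothetical units
coupling = 50.0   # exchange coupling A, same units, much larger than delta

lines = ab_spectrum(delta, coupling)
inner_offset = lines[2][0]
outer_weight = lines[3][1]
large_exchange_limit = delta ** 2 / (4.0 * coupling)

print(f"inner line offset = {inner_offset:.6f}, limit = {large_exchange_limit:.6f}")
print(f"relative weight of outer lines = {outer_weight:.2e}")
```

For A much larger than δ the outer satellites near ±A lose their intensity and the strong inner pair approaches ±δ²/4A, which is the pair of lines described above.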

Considerable care should, however, be exercised, as a
finite absorption over a region $(gH-g'H')\beta h^{-1}$ cannot
generally be excluded by the calculation of the moments
alone. The example of the case of three spins, two of
which are identical with gyromagnetic ratio $g_1=g_2=g$
while the third one has $g_3$, illustrates this. A complete
quantum mechanical solution can be given in this case.
Our interest is in the situation for very large exchange
energy between the unlike spins, $A(\mathbf{I}_1+\mathbf{I}_2)\cdot\mathbf{I}_3$. For
$A\to\infty$, the absorption spectrum consists of three
lines, one at the average position $(\tfrac{2}{3}g+\tfrac{1}{3}g_3)h^{-1}\beta H$ and
two satellites symmetrically located at $[(4/3)g-\tfrac{1}{3}g_3]h^{-1}\beta H$
and $g_3h^{-1}\beta H$. The intensities of these lines are
in the ratio 4:90:9. Thus no definite conclusions can
be drawn from the calculation of second and fourth
moments alone, unless the truncation process of Eqs.
(16) and (17) can be refined. This is even more true if
the exchange interaction is not large, but comparable
to the difference in Zeeman energies. To obtain a
semiquantitative idea of what can be expected in this case,
recourse is had to less rigorous, approximate methods.
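The three line positions quoted above for the large-exchange limit can be checked with the usual vector-model (Landé) projection, under the assumption that the total spin J of the three nuclei is then a good quantum number. The sketch below is only an illustration; the helper name `lande_weights` is hypothetical, and the spins are taken as 1/2, which the text does not state explicitly for this example.

```python
from fractions import Fraction


def lande_weights(J, I12, I3):
    """Vector-model weights (w, w3) such that g_eff = w*g + w3*g3.

    J   : total spin of the three nuclei
    I12 : combined spin of the two identical nuclei (0 or 1)
    I3  : spin of the third nucleus
    """
    JJ = J * (J + 1)
    a = I12 * (I12 + 1)
    b = I3 * (I3 + 1)
    w = (JJ + a - b) / (2 * JJ)    # weight of g (identical pair)
    w3 = (JJ + b - a) / (2 * JJ)   # weight of g3 (third spin)
    return w, w3


half = Fraction(1, 2)
multiplets = {
    "J=3/2, I12=1": lande_weights(3 * half, Fraction(1), half),
    "J=1/2, I12=1": lande_weights(half, Fraction(1), half),
    "J=1/2, I12=0": lande_weights(half, Fraction(0), half),
}

for name, (w, w3) in multiplets.items():
    print(f"{name}: g_eff = ({w}) g + ({w3}) g3")
```

The J=3/2 multiplet gives the central line, and the two J=1/2 multiplets give the satellites, which lie symmetrically at a distance (2/3)(g - g3) on either side of it.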

A well known approximation in exchange problems 
is the introduction of an effective field. Consider the 
resultant magnetization M of the unprimed variety 
and M’ of the primed variety. In the absence of exchange 
between mixed pairs these will precess independently 
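in the magnetic field H. The exchange coupling between
M and M' is taken into account by an effective
field $\lambda\mathbf{M}' = A\mathbf{M}'/gg'\beta^2$ acting on M, and $\lambda\mathbf{M}$ acting on
M'. The equations of motion for the two spin systems
become

$$d\mathbf{M}/dt = \hbar^{-1}g\beta\{\mathbf{M}\times(\mathbf{H}+\lambda\mathbf{M}')\},$$
$$d\mathbf{M}'/dt = \hbar^{-1}g'\beta\{\mathbf{M}'\times(\mathbf{H}'+\lambda\mathbf{M})\},$$

with the two eigenfrequencies

$$\nu = \tfrac{1}{2}(gH+g'H')\beta h^{-1} + \tfrac{1}{2}\beta h^{-1}\lambda(g'M_z+gM_z') \pm \tfrac{1}{2}\beta h^{-1}\{(gH-g'H')^2 + 2(gH-g'H')(gM_z'-g'M_z)\lambda + \lambda^2(g'M_z+gM_z')^2\}^{1/2}. \qquad (21)$$
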
This approach has been used frequently to discuss the 
ferromagnetic resonance of the two spin systems in 


ferrites.$^{19}$ In the case of large exchange it gives the
correct resonance condition at the "average" frequency:

$$\nu = \{fgH+(1-f)g'H'\}\beta h^{-1}, \quad \text{with} \quad f = g'M_z(g'M_z+gM_z')^{-1}.$$

19 See, e.g., R. K. Wangsness, Phys. Rev. 91, 1085 (1953).
Further references to the literature are given in this paper.

Above the Curie point the omission of the fluctuations
in the exchange fields, i.e., of terms quadratic in the
transverse components of the spins, is however very
serious. The solution (21) gives erroneous results for
the nuclear two-spin system, where the average ex- 
change fields are small compared to the fluctuations. 
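As a numerical illustration of the effective-field result, the following sketch evaluates the two roots of Eq. (21) and compares the lower one with the "average" frequency in the limit of large coupling. It works in units where $\beta h^{-1}=1$; the function names and the sample numbers are hypothetical and chosen only to show the limiting behavior.

```python
import math


def eigenfrequencies(g, gp, H, Hp, Mz, Mzp, lam, beta_over_h=1.0):
    """Return the (+, -) roots of Eq. (21) for the coupled spin systems."""
    mean = 0.5 * (g * H + gp * Hp) + 0.5 * lam * (gp * Mz + g * Mzp)
    d = g * H - gp * Hp
    disc = (d ** 2
            + 2.0 * d * (g * Mzp - gp * Mz) * lam
            + lam ** 2 * (gp * Mz + g * Mzp) ** 2)
    root = 0.5 * math.sqrt(disc)
    return beta_over_h * (mean + root), beta_over_h * (mean - root)


def average_frequency(g, gp, H, Hp, Mz, Mzp, beta_over_h=1.0):
    """Weighted mean frequency with f = g'Mz / (g'Mz + gMz')."""
    f = gp * Mz / (gp * Mz + g * Mzp)
    return beta_over_h * (f * g * H + (1.0 - f) * gp * Hp)


if __name__ == "__main__":
    args = (1.0, 1.1, 1.0, 1.0, 0.3, 0.7)   # g, g', H, H', Mz, Mz' (made up)
    for lam in (0.0, 1.0, 1.0e6):
        hi, lo = eigenfrequencies(*args, lam)
        print(f"lambda={lam:g}: nu+={hi:.6f}  nu-={lo:.6f}")
    print("average:", average_frequency(*args))
```

For vanishing coupling the roots are the two unperturbed frequencies; for positive, large coupling the lower root approaches the weighted average, while the upper root moves off with the exchange field.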

In order to obtain a better idea about the change in 
absorption for intermediate values of the nuclear spin 
exchange, it is assumed that the system can be repre- 
sented by a model of a rotating magnetization which 
changes at random intervals from the frequency $\nu_0 = gH\beta h^{-1}$
to the frequency $\nu_0' = g'H'\beta h^{-1}$ and vice versa.
The magnetization is a continuous function of time. 
The fluctuating exchange fields are responsible for the 
transitions which have a probability per unit time of 
the order of the exchange frequency. Such a model has 
been discussed by Slichter,?° by Archer” and by An- 
derson'® with three entirely different mathematical 
methods. They all discuss the case that the oscillator or 
rotating magnetization spends equal time at the two 
frequencies, and the transition probability to jump from 
vo to vo’ or vo’ to vo is the same. Anderson and Archer 
derive the spectral distribution from the correlation 
function for the freely radiating oscillator, whereas 
Slichter starts from the magnetization driven by an 
external harmonic field. Slichter’s method is by far the 
simplest and covers in addition the case that the oscilla- 
tor is subjected to other external damping mechanisms 
at each of the two frequencies. Furthermore his method 
can be readily extended$^{22}$ to the situation that the
oscillator spends a fraction $f$ of the time at frequency $\nu_0$
and a fraction $f'=1-f$ at the frequency $\nu_0'$, corresponding
to the fractions of the primed and unprimed isotope.
The transition probability for a frequency jump is 
assumed to be proportional to the probability f or 
f’ of the oscillator frequency after the transition: 


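$$\tau^{-1} = |A|\hbar^{-1}z^{1/2}(1-f), \qquad \tau'^{-1} = |A|\hbar^{-1}z^{1/2}f. \qquad (22)$$

Unfortunately the numerical factor cannot be derived
in a rigorous manner and is assumed to be unity.

The generalization of Slichter's equation for the
absorption curve is

$$g(\nu) \propto \mathrm{Re}\,\frac{\tau+\tau'+(\tau+\tau')^{-1}\tau\tau'(\alpha\tau'+\alpha'\tau)}{(1+\alpha'\tau')(1+\alpha\tau)-1}, \qquad (23)$$

with

$$\alpha = T_2^{-1}-2\pi i(\nu-\nu_0), \qquad \alpha' = T_2'^{-1}-2\pi i(\nu-\nu_0'). \qquad (24)$$
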


In the limit of large exchange, $\tau\to 0$, a single narrow
resonance at the average frequency results. For $\tau\to\infty$
two separate resonances occur. $T_2^{-1}$ and $T_2'^{-1}$ represent
the widths the separate resonances would have if
there were no frequency jumps. $T_2$ and $\tau$ should be of
the same order of magnitude, but the ratio is not determined
in this model.

20 Gutowski, McCall, and Slichter, reference 4. [Eq. (41).]
21 —. H. Archer, thesis, Harvard University, 1953 (unpublished).
22 H. S. Gutowski and A. Saika, J. Chem. Phys. 21, 1688 (1953).

It is believed that Eqs. (22)-(24)
give a fairly reliable description for intermediate values
of the exchange coupling, although the model is strictly
not applicable. For $T_2^{-1}=T_2'^{-1}=0$ and $\tau=\tau'$ Eq. (23)
reduces to the shape function of Archer-Anderson:


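$$I(\omega)\,d\omega = \frac{\omega_e\omega_0^2\,d\omega}{\omega^4+2\omega^2(2\omega_e^2-\omega_0^2)+\omega_0^4}, \qquad (25)$$

with

$$\omega = 2\pi[\nu-\tfrac{1}{2}(\nu_0+\nu_0')], \qquad \omega_0 = 2\pi\cdot\tfrac{1}{2}(\nu_0-\nu_0'), \qquad \omega_e = \tau^{-1}.$$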
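The behavior of the two-frequency jump model can be explored numerically. The sketch below assumes the following reading of the scanned formulas: the line shape of Eq. (23) is the real part of $[\tau+\tau'+(\tau+\tau')^{-1}\tau\tau'(\alpha\tau'+\alpha'\tau)]/[(1+\alpha'\tau')(1+\alpha\tau)-1]$ with $\alpha = T_2^{-1}-2\pi i(\nu-\nu_0)$, and Eq. (25) is $\omega_e\omega_0^2/[\omega^4+2\omega^2(2\omega_e^2-\omega_0^2)+\omega_0^4]$. Function names and sample parameters are hypothetical, and frequencies are in arbitrary units.

```python
import math


def two_site_lineshape(nu, nu0, nu0p, tau, taup, T2_inv=0.0, T2p_inv=0.0):
    """Unnormalized absorption for an oscillator jumping between nu0 and nu0p."""
    alpha = T2_inv - 2j * math.pi * (nu - nu0)
    alphap = T2p_inv - 2j * math.pi * (nu - nu0p)
    num = tau + taup + tau * taup * (alpha * taup + alphap * tau) / (tau + taup)
    den = (1 + alphap * taup) * (1 + alpha * tau) - 1
    return (num / den).real


def archer_anderson(nu, nu0, nu0p, tau):
    """Equal-population shape function with no extra damping."""
    omega = 2 * math.pi * (nu - 0.5 * (nu0 + nu0p))
    omega0 = math.pi * (nu0 - nu0p)
    omega_e = 1.0 / tau
    return omega_e * omega0 ** 2 / (
        omega ** 4 + 2 * omega ** 2 * (2 * omega_e ** 2 - omega0 ** 2) + omega0 ** 4
    )


if __name__ == "__main__":
    nu0, nu0p = -0.5, 0.5
    for tau in (0.01, 10.0):   # fast and slow exchange
        row = [two_site_lineshape(v, nu0, nu0p, tau, tau, 0.1, 0.1)
               for v in (-0.5, 0.0, 0.5)]
        print(f"tau={tau:g}: g(-0.5), g(0), g(0.5) =", row)
```

With short $\tau$ the absorption peaks at the mean frequency, and with long $\tau$ it peaks at the two separate frequencies. With this reading, the undamped equal-population case of the first function is exactly twice the second, which is consistent with Eq. (23) being stated only as a proportionality.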


IV. COMPARISON OF THEORY AND EXPERIMENT 


With the aid of the formulas developed in the pre- 
ceding sections the experimental results may now be 
used to obtain numerical values for the phenomeno- 
logical quantities introduced. 

The best starting point for the discussion is the ob- 
servation of the second moment in the nearly pure 
isotope, 98.7 percent Tl$^{205}$. One can put f=0 in Eq. (6).
Minor corrections because f=0.013 can be made after- 
wards. The second moment consists of a field-inde- 
pendent contribution from the dipolar or pseudo- 
dipolar broadening and a contribution from the aniso- 
tropy which is proportional to the square of the field. 
These two contributions can be separated because
observations at two different field strengths are avail- 
able. It turns out that for the oxide the anisotropy is so 
much larger than the dipolar contribution, that only 
an upper limit can be given for the magnitude of the 
latter. In the metal the pseudo-dipolar interaction is 
dominant, but an anisotropy of the Knight shift is still 
noticeable (compare Fig. 4). The anisotropy broadening 
is not affected by exchange, but the dipolar part will 
be exchange-narrowed. In order to obtain the correct 
contribution to the second moment the line must be 
integrated from the center to a distance of the order 
of the exchange frequency, Eq. (10), or to about 40 
kc/sec in our samples. It is difficult to obtain reliable 
values for the contributions from the tails and the 
pseudo-dipolar second moment may be somewhat 
underestimated. Fortunately the exchange narrowing 
is not severe, especially not in the metal, where the 
effective “narrowing factor” Eq. (8) is about two. This 
figure is estimated from the ratio of experimental
widths of the Tl$^{205}$ resonance in the 98.7 percent and 14
percent Tl$^{205}$ samples. If there were no pseudo-dipolar
interaction, the narrowing factor would have been 14 
and the line shape would have been determined entirely 
by the anisotropy. The observed second moment in the 
50 percent composition serves to determine the exchange 
constant A, which is the remaining unknown parameter 
in Eq. (6). Then this equation can be used to calculate 
the second moment for the Tl$^{203}$ and Tl$^{205}$ resonances in
all other compositions and at other field strengths. The
results for the oxide are compiled in Table IV, for the
metal in Table V. It is seen that very good agreement
with experimental values in the next to last column is
obtained. For purposes of comparison the last column
contains the contribution from the classical dipolar
interaction alone, which thus far was usually considered
the important broadening agent in solids. Its inadequacy
in the present case is striking.

TABLE IV. Contributions to the second moment of the Tl$^{203}$
and Tl$^{205}$ resonances in Tl$_2$O$_3$. All $\langle\Delta\nu^2\rangle_{\mathrm{Av}}$ contributions are in
units (kc/sec)$^2$.

Abundance          Magnetic  Dipolar and  Aniso-                            Classical
(percent)          field     pseudo-      tropic             Total   Ob-    dipolar
                   (gauss)   dipolar      shift    Exchange  theory  served alone
98.7 Tl$^{205}$    5560      <9           60       5.4       70      70     4.5
98.7 Tl$^{205}$    3288      <9           22       5.4       32      35     4.5
90.5 Tl$^{205}$    5560      <8.6         60       39        104     91     4.3
90.5 Tl$^{205}$    3288      <8.6         22       39        66      54     4.3
70.5 Tl$^{205}$    5560      <7.6         60       121       185     196    3.8
52.1 Tl$^{205}$    5560      <6.6         60       220       285     290    3.3
14.0 Tl$^{205}$    5560      <4.6         60       353       415     >400   2.3
86   Tl$^{203}$    5560      <8.2         60       65        130     118    4.1
47.9 Tl$^{203}$    5560      <6.6         60       240       305     310    3.3
29.5 Tl$^{203}$    5560      <5.0         60       320       383     360    2.5
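The separation of a field-independent contribution from one proportional to the square of the field, using observations at two field strengths, amounts to solving two linear equations. The sketch below illustrates only that arithmetic with the observed second moments of the 98.7 percent sample in the oxide (70 and 35 (kc/sec)$^2$ at 5560 and 3288 gauss); the function names are hypothetical, and the result of this crude two-point split need not coincide with the fitted entries of Table IV.

```python
def separate_contributions(m2_high, m2_low, h_high, h_low):
    """Split M2(H) = const + b*H**2 using two field strengths.

    Returns (const, b*h_high**2): the field-independent part and the
    field-dependent part evaluated at the higher field.
    """
    r = (h_low / h_high) ** 2
    field_part_high = (m2_high - m2_low) / (1.0 - r)
    const = m2_high - field_part_high
    return const, field_part_high


def scale_to_field(value_high, h_high, h_low):
    """Scale a contribution proportional to H**2 from h_high to h_low."""
    return value_high * (h_low / h_high) ** 2


if __name__ == "__main__":
    const, aniso = separate_contributions(70.0, 35.0, 5560.0, 3288.0)
    print(f"field-independent part: {const:.1f} (kc/sec)^2")
    print(f"field-dependent part at 5560 gauss: {aniso:.1f} (kc/sec)^2")
    # Anisotropy entries: 60 at 5560 gauss in the oxide, 100 in the metal.
    print(f"60 scaled to 3288 gauss: {scale_to_field(60.0, 5560.0, 3288.0):.1f}")
    print(f"100 scaled to 3288 gauss: {scale_to_field(100.0, 5560.0, 3288.0):.1f}")
```

The scaling helper shows that the tabulated anisotropy contributions at the two fields (60 and 22 for the oxide, 100 and 35 for the metal) follow the square-of-field law to within rounding.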

The numerical values for the exchange interactions 
between nearest neighbors in the oxide can be found 
from Table IV and Eq. (6). They are 


$$|A|h^{-1} = 12\ \text{kc/sec}, \qquad -2.2 < Bh^{-1} < 0.35\ \text{kc/sec}.$$


This latter ambiguity arises from the fact that there 
may be constructive or destructive interference with 
the ordinary dipolar interaction. Neither can the sign 
of A be determined by these experiments. An inde- 
pendent determination of the exchange interaction is 
possible from the inward shift of the resonance at lower 
field strength with the aid of Eqs. (23) or (25). In Fig. 3 
the integrated experimental line is compared with these 
theoretical expressions. The dotted curve (a) corre- 
sponds to Eq. (23), and the dotted curve (b) to Eq. 
(25). The distance between the maxima of absorption 
is reduced to 79 percent of the distance of the unperturbed
Tl$^{203}$ and Tl$^{205}$ resonances. This determines $\tau$.
If one takes $\pi\tau(\nu_0'-\nu_0) = 2.38$, the theoretical curves
give the correct position of the maxima. With Eq. (22)
this value of $\tau$ leads to $|A|h^{-1} = 9.6$ kc/sec, in fair agreement
with the value derived from the second moment.
The discrepancy is undoubtedly due to the uncertainty
in the numerical factors. The value $|A|h^{-1} = 12$ kc/sec
is more reliable and should be retained. Curve (a) gives
much better agreement with experiment, since it includes
the additional broadening $T_2^{-1}$, not associated
with frequency transitions. The very good fit was obtained
by taking $\tau/T_2 = 0.6$.

TABLE V. Contributions to the second moment of the Tl$^{205}$
resonance in thallium at 77°K. All contributions to $\langle\Delta\nu^2\rangle_{\mathrm{Av}}$ are in
units (kc/sec)$^2$.

Abundance          Magnetic  Dipolar and  Aniso-                            Classical
(percent)          field     pseudo-      tropic             Total   Ob-    dipolar
                   (gauss)   dipolar      shift    Exchange  theory  served alone
98.7 Tl$^{205}$    5560      200          100      14        314     300    6.2
                   3288      200          35       14        249     220    6.2
90.5 Tl$^{205}$    5560      189          100      102       391     395    5.9
                   3288      189          35       102       326     280    5.9
70.5 Tl$^{205}$    5560      170          100      312       592     590    5.3
                   3288      170          35       312       517     590    5.3
52.1 Tl$^{205}$    5560      148          100      570       818     820    4.6
14.0 Tl$^{205}$    5560      105          100      910       1105    ?      3.3

To evaluate the exchange constants for nearest 
neighbors in the metal, twenty percent is subtracted 
from the contribution of the exchange and dipolar 
broadening to account approximately for the effect of 
farther neighbors. Then Eq. (6) gives the following 
values: $|A| = 17.5$ kc/sec and $B = +4.5$ or $-6.5$ kc/sec,
as compared to the classical dipolar constant $g^2\beta^2a^{-3} = 1$
kc/sec. At the lower field strength in the 50 percent-50 percent
composition of the metal $\pi\tau(\nu_0'-\nu_0) = 1.42$,
which results in a broad unobservable line shape. The
anisotropy of the shift in the hexagonal metal amounts
to $\delta_{\mathrm{ax}} = 0.08$ percent. Consequently, the anisotropy in
the Knight shift is

$$(\Delta\nu_\parallel-\Delta\nu_\perp)/\Delta\nu_{\mathrm{is}} = 0.24\ \text{percent}/1.56\ \text{percent} = 16\ \text{percent}.$$


The evaluation of the anisotropy constants for the 
shift in the oxide is much more complicated. In the 
powder specimen one is clearly not concerned with the 
orientation of the shift ellipsoids in the crystal. Only 
the axial ratios of the ellipsoids are important, but the 
oxide has two different types of thallium atoms in the 
unit cell. Eight atoms on the body diagonal have 
ellipsoids with rotational symmetry and require one 
anisotropy constant. Twenty-four atoms of lower sym- 
metry have other ellipsoids, all with the same axial 
ratios. This requires two additional constants. Further- 
more it is conceivable that the isotropic shift of the 
atoms of first kind is not the same as for the atoms of 
the second kind. This would lead to an additional 
contribution to the second moment, which has not been 
considered before in this paper. Clearly the experi- 
mental curve of Fig. 2 cannot yield these four inde- 
pendent constants. Experiments at much higher field 
strengths would be necessary to give a better deter- 
mination of the line shape. Data on single crystals 
would hardly be more revealing. For an arbitrary 
orientation of the external field there would be sixteen 
different resonance curves! This difficulty is inherent 
in the large number of atoms in the unit cell. 

We have tried to fit the experimental curve as well 
as possible with a line shape of axial symmetry alone 
[Eq. (14), dotted curve in Fig. 2] and separately for a 
line shape for less than axial symmetry [Eq. (13), 
dashed curve in Fig. 2]. It is seen that 75 percent of the 
latter curve and 25 percent of the former gives an ex- 
tremely good fit with the experimental shape. In the 
case of axial symmetry the anisotropy parameter is
determined by the second moment alone: $\delta_{\mathrm{ax}} = 0.063$
percent. This gives an anisotropy of the chemical shift

$$(\Delta\nu_\parallel-\Delta\nu_\perp)/\Delta\nu_{\mathrm{is}} = 0.19/0.55 = 34\ \text{percent}.$$

In the case of lower symmetry the two parameters are
determined by the second moment and the shift of the
center of gravity of the line with respect to the maximum.
Using (4) and (6) one finds $\delta_{\mathrm{ax}} = 0.061$ percent
and $\delta_{\mathrm{asym}} = 0.03$ percent. The uncertainty in these
numbers is rather large, especially in $\delta_{\mathrm{asym}}$. Yet it can
be said that conclusive evidence for the anisotropy of a 
chemical shift has been found, and that the anisotropy 
has some nonaxial symmetry. 

The influence of exchange between the Tl nuclei
with different chemical shifts has been ignored in this 
discussion. If the exchange interaction were very large 
compared to the difference in chemical shifts an average 
resonance frequency for all thallium atoms in the unit 
cell would be observed. Since the unit cell is cubic, this 
average would show no anisotropy. The experimental 
results indicate that the exchange interaction does not 
produce such a complete averaging. Its influence at 
the higher magnetic field is probably small, but at the 
lower field the exchange energy and the difference in 
chemical shifts are comparable. It seems a hopeless 
task to treat the exchange interaction between the 
various nonequivalent Tl nuclei in the unit cell 
adequately. This points again to the desirability to in- 
vestigate the anisotropy of the chemical shift in simpler 
structures. 

It has been shown that all experimental data can be 
interpreted satisfactorily in terms of the phenomeno- 
logical theory. This interpretation indicates that the 
discussion of the Knight shift in thallium and its alloys 
as given in a previous paper (reference 1) remains essen- 
tially unchanged. The broadening of the line in alloys 
may in part be due to exchange between unlike nuclei. 
The shifts at the lower field strength reported in refer- 
ence 1, are not reliable, however, because of the co- 
alescence of the two thallium resonances at low fields. 

The gradual change in shape from Gaussian to 
Lorentzian with increasing concentration of like neigh- 
bors has been discussed before. It is in semiquantitative 
agreement with Eqs. (7), (8), and (9). It remains to be 
shown that the numerical values, found for the ex- 
change constants A and B, are reasonable ones in terms 
of an atomistic theory. 


V. ATOMISTIC THEORY OF NUCLEAR SPIN EXCHANGE 
COUPLING IN SOLIDS 


Ramsey and Purcell have shown that the nuclear 
spin-electron spin interaction will give rise in second- 
order perturbation theory to an exchange-type of 
coupling between nuclear spins in molecules. Ramsey’s 
theory for this type of interaction in molecules can 
readily be extended to periodic lattices. Ruderman and 
Kittel have already given the extension for the nuclear 
spin coupling by the conduction electrons in a metal. 
The analogous process for coupling in an insulator via 
excited electron states will be derived here along similar 
lines.

The electrons in the solid may be adequately described
by one-electron wave functions of the Bloch
type,$^{23}$ normalized over a large volume V:


Yi =e, (4), vf ntadr= A. (26) 


u(r) has the periodicity of the lattice. Exchange and 
correlation effects are neglected in this treatment. 

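A pair of nuclear spins is introduced at positions $\mathbf{R}_i$
and $\mathbf{R}_j$ in the lattice. The perturbation of the wave
function by the nuclear spin-electron spin interaction
is considered,

$$\mathcal{H}_{NS} = \frac{16\pi}{3}\beta g_N\beta_N\,\mathbf{I}_i\cdot\mathbf{S}\,\delta(\mathbf{r}_i) + 2\beta g_N\beta_N\,\mathbf{I}_i\cdot\{\mathbf{S}r_i^{-3}-3r_i^{-5}(\mathbf{S}\cdot\mathbf{r}_i)\mathbf{r}_i\}, \qquad (27)$$

where $\mathbf{r}_i=\mathbf{r}-\mathbf{R}_i$ is the radius vector from the nucleus i
to the electron. The Bohr magneton $\beta$ has a negative
value. The nuclear spin j produces a similar interaction.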

The second term represents the classical dipolar 
interaction between two dipoles. It was omitted in 
Kittel’s paper. It will be shown to be responsible for 
the pseudo-dipolar interaction in the next section. For 
the time being consider only the first term. When the 
wave function has a finite value at the nucleus, there is 
an interaction corresponding to the hyperfine structure 
in atomic S-states. In second-order perturbation theory 
the change in energy of the system due to the Hamil- 
tonian (27) is 


$$\sum_{k',s'}(\psi_{ks}|\mathcal{H}_{NS}|\psi_{k's'})(\psi_{k's'}|\mathcal{H}_{NS}|\psi_{ks})\{E(k)-E(k')\}^{-1}.$$

The summation is over all excited states k' and the
two spin orientations s'. The interest is in terms which
depend on both nuclear spins. The transition to the
intermediate excited state is due to spin i and the
transition back to the ground state due to spin j, or
the roles of i and j are reversed. To obtain the total energy
perturbation caused by the introduction of nuclear
spins i and j, the summation over all initially occupied
states k, s has to be performed. The three-dimensional
$\delta$ function in (27) makes the evaluation of the matrix
elements simple. The summation over the two electron
spin orientations in the initial and intermediate states
can also be performed readily. The integration over the
volume V cancels the normalization factor in (26) and
the result is


Ka I,- 1{2 Redo «> 0 A:(k,k’) 
KA; (hi ke HE) BBL E(k) —E(k’)}-], (28) 
23 See, e.g., F. Seitz, Modern Theory of Solids (McGraw-Hill
Book Company, Inc., New York, 1940), p. 348 ff.
24 In subscripts k and k' will often stand for $\mathbf{k}$ and $\mathbf{k}'$.

with

$$A_i(k,k') = A_i^*(k',k) = (16\pi/3)g_N\beta_N\beta\,u_k(\mathbf{R}_i)u_{k'}^*(\mathbf{R}_i).$$


The complex conjugate is added because the role of i
and j can be interchanged, resulting in twice the real
part. The expression between square brackets in (28) 
represents the previously introduced quantity A;; and 
is generally valid for all periodic lattices. The difference, 
e.g., between metals and insulators becomes apparent 
in the evaluation of the sums over all occupied initial 
states k and all unoccupied intermediate states k’. To 
gain insight into the magnitude of the quantity $A_{ij}$ some
simplifying assumptions about the k-dependence of the 
integrand will be made. These restrictions are not 
essential and many refinements on them could be 
introduced. More complicated integrals would then 
have to be evaluated. 

For the case of an insulator the assumption is made 
that the occupied band is narrow compared to the gap 
between the top of this band and the bottom of the 
conduction band, and that this gap is uniform for all 
directions, $E_g \gg E_t$. The energy of a conduction electron
in this band and all higher bands is represented by
$E_{k'} = k'^2\hbar^2/2m'$, where k' assumes the value 0 at the
bottom of the band and runs to infinity; m' is the effective
mass. Lower filled bands are assumed to have
such a large energy difference with the conduction
band that their contribution is neglected. With these
approximations one has

$$E_{k'}-E_k = E_g+k'^2\hbar^2/2m'.$$

Furthermore a suitable average,

$$\langle A_i(k,k')A_j(k',k)\rangle_{\mathrm{Av}} = \langle|A_{ij}|^2\rangle_{\mathrm{Av}}, \qquad (29a)$$


over the two bands is introduced. This is not simply 
the arithmetic mean, as the quantities are weighted 
with the inverse energy difference between the two 
states. With these simplifications the integrations over 
k and k’ can be carried out in spherical coordinates. 

If one writes $\mathbf{R}_{ij}=\mathbf{R}_i-\mathbf{R}_j$, the result of the angular
integrations is

$$A_{ij} = -2\pi^{-4}\langle|A_{ij}|^2\rangle_{\mathrm{Av}}\,m'\hbar^{-2}R_{ij}^{-2}\int_0^{k_t}\sin(kR_{ij})\,k\,dk \times \int_0^{\infty}\sin(k'R_{ij})\,k'[k'^2+2m'E_g\hbar^{-2}]^{-1}dk'. \qquad (30)$$

The last integral can be evaluated by contour integration
in the complex plane and gives an exponential
factor. The upper limit of the first integral is the top
of the filled band. For a Wigner-Seitz sphere of atomic
volume $v_a$, which accommodates one electron of given
spin orientation, one has

$$k_t = 2\pi(3/4\pi v_a)^{1/3}. \qquad (31)$$


The spherical approximation is clearly not valid near
the zone boundaries, but it allows for a simple evaluation
of the integral and is consistent with earlier
approximations. The final result is

$$A_{ij} = -\pi^{-3}\langle|A_{ij}|^2\rangle_{\mathrm{Av}}\,m'\hbar^{-2}R_{ij}^{-4}[\sin(k_tR_{ij})-k_tR_{ij}\cos(k_tR_{ij})]\exp\{-\hbar^{-1}(2m'E_g)^{1/2}R_{ij}\}. \qquad (32)$$


In the case of a metal there is no energy gap and a
conduction electron near the top of the Fermi distribution
may be scattered into a state with approximately
the same energy. The integration over k is now
from zero to $k_m$, the value of the wave number at the
Fermi level $E_F$. For one conduction electron per atomic
volume $v_a$,

$$k_m = 3^{1/3}\pi^{2/3}v_a^{-1/3}, \qquad E_F = \hbar^2k_m^2/2m'. \qquad (33)$$

The integration over k' is from $k_m$ to infinity, to include
all overlapping higher bands.

There are a few scattering processes with $k\approx k'\approx k_m$,
which give a very large contribution, because the
energy denominator becomes very small; but these
partially cancel. Ruderman and Kittel have shown
how the integrations give a finite result:

$$A_{ij} = -(2\pi)^{-3}\langle|A_{ij}|^2\rangle_{\mathrm{Av}}\,m'R_{ij}^{-4}\hbar^{-2}[\sin(2k_mR_{ij})-2k_mR_{ij}\cos(2k_mR_{ij})]. \qquad (34)$$

Since most of the contribution comes from scattering
processes near the Fermi level, $\langle|A_{ij}|^2\rangle_{\mathrm{Av}}$ in the metal is
effectively an average over the Fermi surface.

The main difference between a metal and an insulator
is the exponential factor in the latter case. The exponent
contains the square root of the energy gap. Whereas
the magnitude of the interaction drops off as $R_{ij}^{-3}$ for
the metal, like the dipolar interaction, the exponential
decrease in the insulator makes the "nearest neighbor
only" assumption a good one in this case.

The quantity $|A_{ij}|^2$ calculated from Eqs. (29) and
(30) is related to the atomic hyperfine splitting $\nu^{\mathrm{hfs}}$:

$$|A_{ij}|^2 = \frac{4\pi^2\hbar^2}{(2I_i+1)(2I_j+1)}\,\nu_i^{\mathrm{hfs}}\nu_j^{\mathrm{hfs}}\,v_a^2\,\xi_i\xi_j, \qquad (35)$$

where $\xi_i = P_i^{\mathrm{eff}}/P_a$ is a numerical factor of the order of
unity. $P_a$ is the electron density at the nucleus in the
atom in the s-state with hyperfine splitting $\nu^{\mathrm{hfs}}$, while
$P^{\mathrm{eff}}$ is the corresponding density in the solid, suitably
averaged over valence and conduction bands. For a
metal $P^{\mathrm{eff}}$ is approximately equal to the density produced
at the nucleus, if all electrons were at the Fermi
surface, because these electrons give the major contribution
in the integral leading to (34). The exchange
interaction between a pair of atoms in a metal in the
quasi-free electron approximation is proportional to
$(\nu^{\mathrm{hfs}}\xi)^2m'$, whereas the Knight shift is proportional to
$\nu^{\mathrm{hfs}}\xi m'$. A different combination of the quantities $\xi$ and
the effective mass, or $\xi$ and the density of states, occurs.
In principle both $(\nu^{\mathrm{hfs}}\xi)$ and m' can therefore be determined.
The spin-lattice relaxation gives the same combination
of $\xi$ and m' as the Knight shift.

The factor $v_a^2$ appears in (35) because of the normalization
condition (29). Taking $R_{ij}^3\approx v_a$ in (34), it
is seen from (33) and (35) that $A_{ij}$ has the order of
magnitude of the square of the hyperfine splitting over
the Fermi energy.

For insulators the interaction between nearest neighbors
will in general be somewhat smaller. The ratio of
the exchange with respect to the classical dipolar interaction
increases rapidly for heavier isotopes as the
effective electron density $P^{\mathrm{eff}}$ increases. Kittel estimates
that this ratio is unity for silver. It is about 15
for thallium. For a pair of sodium atoms the exchange
interaction would be roughly 5 percent of the dipolar
interaction, but for a sodium-thallium pair it would
again be of the order of unity. These qualitative estimates
hold for elements as well as compounds, for
molecules as well as coherent matter, for metals as
well as insulators. In molecules with a low-lying ground
level and in insulators with a large forbidden gap the
effects will clearly be smaller, but nuclear spin exchange
should always give marked effects, whenever heavy
isotopes are present.

The sign of the exchange interaction is determined by
the function $-\sin x + x\cos x$ with $x=k_tR_{ij}$ in insulators,
and $x=2k_mR_{ij}$ in metals. For a b.c.c. insulator we have
$A_{ij}<0$ for nearest neighbors, but $A_{ij}>0$ for next
nearest neighbors. Since the nearest neighbor interaction
dominates due to the exponential factor, the
nuclear magnetic moments will tend to align parallel.
Other data for the f.c.c. structure and for metals are
compiled in Table VI. It is seen that the nuclear
moments in a b.c.c. metal would tend to align in two
antiparallel systems. In the case of metals the alignment
of the nuclear spins is however also determined in
part by the first-order spin interaction with the conduction
electron spins, as discussed by Fröhlich.$^{25}$ This
and the long-range character of the nuclear spin exchange
make a prediction of nuclear spin ordering in
metals difficult. The simpler case of insulators allows
the prediction that the b.c.c. and f.c.c. lattices of
nuclear spins will show a ferromagnetic ordering at
very low temperatures. The Curie point of the thallium
oxide with $h^{-1}A_{ij}=1.2\times10^4$ cps and $z=12$ would lie at

$$T_c = 2zAI(I+1)/3k = 3.46\times10^{-6}\ {}^\circ\text{K}. \qquad (36)$$


25 H. Fröhlich and F. R. N. Nabarro, Proc. Roy. Soc. (London)
175, 382 (1940).

Its band structure is more complicated than the present
treatment allows for. It is only a reasonable guess that
it will become nuclear ferromagnetic, but the occurrence
of some type of nuclear alignment below this temperature
is certain, independent of the shape of
the specimen. The moot question of alignment by dipolar
interaction alone is not involved.$^{26}$
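The sign rule and the Curie point estimate can both be checked with a few lines of arithmetic. The sketch below assumes the readings $k_t = 2\pi(3/4\pi v_a)^{1/3}$ for Eq. (31) and $k_m = (3\pi^2/v_a)^{1/3}$ for Eq. (33), cubic lattices with unit lattice constant, and nuclear spin I = 1/2 for thallium (a known property of both isotopes, not restated in this section). The function names are hypothetical.

```python
import math

H_PLANCK = 6.62607015e-34   # J s
K_BOLTZ = 1.380649e-23      # J/K


def sign_function(x):
    """Function whose sign gives the sign of A_ij: -sin x + x cos x."""
    return -math.sin(x) + x * math.cos(x)


def kt_insulator(v_a):
    """Top of the filled band, one electron of given spin per volume v_a."""
    return 2.0 * math.pi * (3.0 / (4.0 * math.pi * v_a)) ** (1.0 / 3.0)


def km_metal(v_a):
    """Fermi wave number, one conduction electron per volume v_a."""
    return (3.0 * math.pi ** 2 / v_a) ** (1.0 / 3.0)


def curie_temperature(A_over_h, z, I):
    """T_c = 2 z A I(I+1) / 3k, with A given as a frequency A/h in cps."""
    return 2.0 * z * H_PLANCK * A_over_h * I * (I + 1.0) / (3.0 * K_BOLTZ)


# Lattice constant a = 1: atomic volume and neighbor distances.
BCC = {"v_a": 0.5, "nn": math.sqrt(3.0) / 2.0, "nnn": 1.0}
FCC = {"v_a": 0.25, "nn": 1.0 / math.sqrt(2.0), "nnn": 1.0}

if __name__ == "__main__":
    for name, lat in (("b.c.c.", BCC), ("f.c.c.", FCC)):
        kt, km = kt_insulator(lat["v_a"]), km_metal(lat["v_a"])
        for shell in ("nn", "nnn"):
            R = lat[shell]
            print(name, shell,
                  "insulator:", sign_function(kt * R),
                  "metal:", sign_function(2.0 * km * R))
    print("T_c (K):", curie_temperature(1.2e4, 12, 0.5))
```

A negative value corresponds to $A_{ij}<0$, i.e., a tendency toward parallel alignment of the two moments.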


Unfortunately the relaxation times to establish thermal
equilibrium will become prohibitively long in
insulators at such low temperatures. Experimentally
metals would be more suited for the investigation of
nuclear magnetism at extremely low temperatures, but
here the theoretical situation is less certain.

Although the simplifications made in obtaining Eqs.
(32) and (34) are too crude to allow an actual calculation
of $A_{ij}$ even for simpler structures than thallium
and thallium oxide, we wish to show that the order of
magnitude is in agreement with the experimental values.
Using the geometry of a f.c.c. lattice with 12 nearest
neighbors at $3.45\times10^{-8}$ cm for both metal and oxide,
the known atomic hyperfine splitting$^{27}$ in the $6s^2\,7s$
state, $\nu^{\mathrm{hfs}}=0.4$ cm$^{-1}$, and an effective mass equal to the
free electron mass, one obtains for the metal $h^{-1}A_{ij}=2.2$
kc/sec. Agreement with experiment is obtained
when $\xi=2.8$. The effective electron density must be
taken 2.8 times as large as in the atomic $6s^2\,7s$ state, or 1.5
times as large as in the $6s^2\,6p\ {}^2P_{1/2}$ state. To obtain the
correct Knight shift of 1.56 percent, $\xi=1.64$ should be
taken. The discrepancy merely points to the inadequacy
of the quasi-free electron model in this case. There is
no justification for an attempt to derive a better value
for the effective mass.
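The order-of-magnitude estimate for the metal can be retraced numerically. The sketch below rests on an assumed reading of the scanned Eqs. (33)-(35): $k_m=(3\pi^2/v_a)^{1/3}$, $A_{ij}=-(2\pi)^{-3}\langle|A_{ij}|^2\rangle m'\hbar^{-2}R_{ij}^{-4}[\sin 2k_mR_{ij}-2k_mR_{ij}\cos 2k_mR_{ij}]$, and $|A_{ij}|^2=4\pi^2\hbar^2(\nu^{\mathrm{hfs}})^2v_a^2\xi^2/(2I+1)^2$ for like nuclei. It also assumes I = 1/2 and an f.c.c. atomic volume $v_a=R^3/\sqrt{2}$ for nearest-neighbor distance R. The function name is hypothetical.

```python
import math

H_PLANCK = 6.62607015e-34            # J s
HBAR = H_PLANCK / (2.0 * math.pi)    # J s
M_ELECTRON = 9.1093837015e-31        # kg
C_CM_PER_S = 2.99792458e10           # cm/s, converts cm^-1 to cps


def metal_exchange_over_h(R, v_a, nu_hfs, xi=1.0, I=0.5, m_eff=M_ELECTRON):
    """A_ij/h in cps for a pair of like nuclei at distance R (SI units).

    R in m, v_a in m^3, nu_hfs in cps; xi is the density ratio P_eff/P_a.
    """
    k_m = (3.0 * math.pi ** 2 / v_a) ** (1.0 / 3.0)
    x = 2.0 * k_m * R
    a_sq = (4.0 * math.pi ** 2 * HBAR ** 2 * nu_hfs ** 2
            * v_a ** 2 * xi ** 2 / (2.0 * I + 1.0) ** 2)
    a_ij = (-(2.0 * math.pi) ** -3 * a_sq * m_eff
            / (HBAR ** 2 * R ** 4) * (math.sin(x) - x * math.cos(x)))
    return a_ij / H_PLANCK


if __name__ == "__main__":
    R = 3.45e-10                      # nearest-neighbor distance in m
    v_a = R ** 3 / math.sqrt(2.0)     # f.c.c. atomic volume
    nu_hfs = 0.4 * C_CM_PER_S         # 0.4 cm^-1 expressed in cps
    for xi in (1.0, 2.8):
        print(f"xi={xi}: A_ij/h = {metal_exchange_over_h(R, v_a, nu_hfs, xi):.0f} cps")
```

With $\xi=1$ this reading gives a value of roughly 2 kc/sec, and with $\xi=2.8$ a value near the experimental 17.5 kc/sec, which is the comparison made in the text.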

If one assumes the same effective density and the
free electron mass for the oxide, Eq. (32) gives agreement
with the experimental value $h^{-1}A_{ij}=12$ kc/sec if
the energy gap is taken as $E_g=0.12$ ev. The experimental
data and the theoretical expressions can thus
be made to agree with reasonable values of the physical
constants.


The problem of crystalline anisotropy, for which there
is no room in the spherical approximation, can be discussed
in a formal manner by expanding the wave
function $\psi_k$ and the energy denominator in a series of
spherical harmonics consistent with the crystalline
symmetry. For a crystal with axial symmetry one
could, e.g., write


I Vie= (ue-bikartr, COSP e+ + + +)ett 
X (E(k) — E(k) = E+ AE "43 c0s"xo—1) 
+AEy 3 (3 COSA Ko — 1)+ mes, 


where $\theta_{kc}$ is the angle between the wave vector k and
the crystallographic axis c. Repeating the calculation
of $A_{ij}$ with the addition of these terms gives upon
integration terms depending on the angle between $\mathbf{R}_{ij}$
and c. The spin dependence of the interaction still has
the scalar product form. The magnitude of the exchange
constant will depend somewhat on the orientation of
$\mathbf{R}_{ij}$. It will not be possible to observe this effect with
the techniques described in this paper. Only the average
exchange interaction for a number of neighbors is
measured. Therefore no explicit calculation will be
presented.

26 J. A. Sauer and A. N. V. Temperley, Proc. Roy. Soc. (London)
176, 203 (1940); J. H. Van Vleck, J. Chem. Phys. 5, 320 (1937);
J. M. Luttinger and L. Tisza, Phys. Rev. 70, 954 (1946).
27 P. Brix and H. Kopferman, Landolt-Börnstein, Zahlenwerte
und Funktionen I 5 (Springer, Berlin, 1952).

The pseudo-dipolar interaction does not have its 
origin in crystalline anisotropy, but in the tensor char- 
acter of the dipolar interaction of the second term in 
the Hamiltonian (27). The important differences caused 
by the angular dependence of this interaction must now 
be considered. The terminology “anisotropic exchange” 
which has often been used to denote the pseudo-dipolar 
interaction in papers on electron magnetism has been 
avoided in the present article. It is clear from this 
paragraph that its use may be misleading. 


VI. THEORY OF PSEUDO-DIPOLAR INTERACTION 


It will be convenient to write the angular dependence
of all quantities in terms of spherical harmonics
$P_l^{|m|}(\cos\theta)$, which are related to the normalized,
generalized Laplace spherical functions by


2i+1 (l— ~T 
2 (l+|m|)! 
X P,'"! (cosd) (2r)—te™?, 





Y"=(— »| 


(37) 


When $\theta_{kr}$ and $\varphi_{kr}$ correspond to the angle between two
vectors k and r, the shorthand notations ${}^{kr}P_l^m = P_l^m(\cos\theta_{kr})$
and $P_l^{-m} = (-1)^m[(l-m)!/(l+m)!]P_l^m$ are
introduced. The orthogonality properties of these functions
are well known.$^{28}$ Frequent use will be made of
the addition theorem,

$${}^{ab}P_l = \sum_{m=-l}^{+l}(-1)^m\,{}^{ac}P_l^m\,{}^{bc}P_l^{-m}\,e^{im(\varphi_{ac}-\varphi_{bc})}, \qquad (38)$$

of its immediate consequence,

$$\iint {}^{ab}P_l\,{}^{bc}P_l^m\,e^{im\varphi_{bc}}\,d\Omega_b = \left(\frac{4\pi}{2l+1}\right){}^{ac}P_l^m\,e^{im\varphi_{ac}}, \qquad (39)$$

where $d\Omega_b$ is an element of solid angle of the vector
space b, and of the expansion formula for a plane wave
used in most scattering problems,


$$e^{i\mathbf{k}\cdot\mathbf{r}} = \left(\frac{2kr}{\pi}\right)^{-1/2}\sum_{l=0}^{\infty}(2l+1)\,i^l J_{l+1/2}(kr)\,{}^{kr}P_l. \qquad (40)$$

28 See, e.g., F. Sauter, Differentialgleichungen der Physik
(Sammlung Goschen, Berlin, 1942); P. M. Morse and H. Feshbach,
Methods of Theoretical Physics (McGraw-Hill Book Company,
Inc., New York, 1953), pp. 1274 ff; L. Pauling and E. B.
Wilson, Introduction to Quantum Mechanics (McGraw-Hill Book
Company, Inc., New York, 1935), Chap. 5.

The Bessel function of half-integral order $J_{l+1/2}$ can be
expressed in the cosine and sine functions and odd
powers of $(kr)^{-1/2}$.

Consider the tensor part of the Hamiltonian (27).
Since the nuclear spins are quantized with respect to
the external field H, the dipolar interaction between
nuclear spin $\mathbf{I}_i$ and electron spin $\mathbf{S}$ can conveniently be
expanded$^{29}$ in harmonics containing the angle between
$\mathbf{r}_i$ and H,


Heensor= ggnbOnri | (— 21 252+314S_+31_S,) Hrip, 


—3(14S.+1,5,) #"*Po'e 168-3 (I_S,+1,S_) 
XFriPletiour—17 S$. HriP.2¢-2ieHr 


—}I_S_#riPze tour], (41) 


g=2.0023 is the electron spin gyromagnetic ratio. J, 
and S, are the components of spin parallel to the mag- 
netic field H; S;, S_, J,, J_ are the usual spin raising 
and lowering operators. 
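As a consistency check on the angular factors in (41), the bracket can be compared with the familiar dipolar form $\mathbf I\cdot\mathbf S - 3(\mathbf I\cdot\mathbf n)(\mathbf S\cdot\mathbf n)$, where n is the unit vector along $\mathbf r_i$ and the z axis is along H. The sketch below is an illustration under stated assumptions: the spin components are treated as ordinary commuting numbers (adequate here because every term contains exactly one component of I and one of S, which commute with each other), and the unnormalized forms $P_2 = \tfrac12(3\cos^2\theta-1)$, $P_2^1 = 3\sin\theta\cos\theta$, $P_2^2 = 3\sin^2\theta$ are assumed. All function names are hypothetical.

```python
import cmath
import math
import random


def dipolar_direct(I, S, n):
    """I.S - 3 (I.n)(S.n) for ordinary 3-vectors I, S and a unit vector n."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return dot(I, S) - 3.0 * dot(I, n) * dot(S, n)


def dipolar_harmonics(I, S, theta, phi):
    """The same quantity assembled from the five harmonic terms of Eq. (41)."""
    c, s = math.cos(theta), math.sin(theta)
    P2, P21, P22 = 0.5 * (3 * c * c - 1), 3 * s * c, 3 * s * s
    Ip, Im = complex(I[0], I[1]), complex(I[0], -I[1])   # I_+ and I_-
    Sp, Sm = complex(S[0], S[1]), complex(S[0], -S[1])   # S_+ and S_-
    Iz, Sz = I[2], S[2]
    e1, e2 = cmath.exp(1j * phi), cmath.exp(2j * phi)
    total = (-2 * Iz * Sz + 0.5 * Ip * Sm + 0.5 * Im * Sp) * P2
    total -= 0.5 * (Ip * Sz + Iz * Sp) * P21 / e1
    total -= 0.5 * (Im * Sz + Iz * Sm) * P21 * e1
    total -= 0.25 * Ip * Sp * P22 / e2
    total -= 0.25 * Im * Sm * P22 * e2
    return total


random.seed(0)
I_vec = [random.uniform(-1, 1) for _ in range(3)]
S_vec = [random.uniform(-1, 1) for _ in range(3)]
theta, phi = 0.7, 1.9
n_vec = [math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta)]

direct = dipolar_direct(I_vec, S_vec, n_vec)
harmonic = dipolar_harmonics(I_vec, S_vec, theta, phi)
print(abs(direct - harmonic))
```

The check only concerns the bracket; the overall sign and the prefactor $g g_N\beta\beta_N r_i^{-3}$ depend on the sign conventions for the moments and are not tested here.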

Bardeen³⁰ and others³¹ have shown that the periodic function $u_k(\mathbf r)$ in the Bloch wave function (26) can be expanded in a fairly rapidly converging power series in k, in the spherical approximation. This is automatically an expansion in spherical harmonics:

$$u_k(\mathbf r) = {}^{s}u(r) + i k c_1\, {}^{p}u(r)\; {}^{kr}P_1 - k^2 c_2\, {}^{d}u(r)\; {}^{kr}P_2 + \cdots. \qquad (42)$$

The coefficients $c_1$ and $c_2$ are constant near the origin k=0, the bottom of the band. Brooks³¹ has shown how they may be determined from the boundary conditions of the lowest wave function on the Wigner-Seitz sphere.

Consider the second-order interaction between nuclear spins i and j, whereby an electron is excited from the initial state with wave vector k and spin orientation s to the intermediate state with k' and s' by the first term in the tensor interaction (41) of nuclear spin i, and back from the intermediate state to the ground state by the scalar interaction of spin j:


16 
—y BeveuBw'(s | —27:S,+31i,S_+41:_S, |s’) 


 (s" | TieS2+3154S_+ 3155, |8){ E(k’) — E(k)} 


Ts 
Kete-e) nf et(k-k’) Tig, *y 8 Ar Poupr 2dr dQr; 
0 


Ketek’) Ri, *(R;)0;,(R,). (43) 
s and s' are the spin quantum numbers in the ground state and the excited state. The integration over $r_i$ is extended only over the Wigner-Seitz sphere with radius $r_s=[(3/4\pi)V_a]^{1/3}$ around the ith nucleus. The dipolar interaction drops off as $r_i^{-3}$ and the small contribution from values $r_i>r_s$ is neglected. Furthermore $\exp[i(\mathbf k-\mathbf k')\cdot\mathbf r_i]$ is replaced by unity inside the sphere. Actually $(\mathbf k-\mathbf k')\cdot\mathbf r_i$ may not be a small quantity. There is no selection rule on k in the transition as the perturbation is localized. As far as nuclear spin orientation is concerned the lattice is not periodic. $\exp[i(\mathbf k-\mathbf k')\cdot\mathbf r_i]$ might be expanded with the aid of (40), and the radial and angular dependence on $r_i$ could still be separated; but in order to keep the integrations relatively simple the exponential is replaced by unity.³²

²⁹ M. H. Cohen, Phys. Rev. 95, 674 (1954).

³⁰ J. Bardeen, J. Chem. Phys. 6, 367 (1937).

³¹ W. Kohn, Phys. Rev. 87, 472 (1952); H. Brooks, Phys. Rev. 91, 1027 (1953).

Substitute for $u_k$ and $u_{k'}$ the expansion (42). Introduce the abbreviations


od A. (kk’) = "2A *(k'R) 


= ggn:iBBn f *n'* (175) u(rsri dri, (44a) 


0 
04h (bk!) =""9A*(E'R) 


= geniBBn f m4 ,.*(7;) 2x (r;)r— dr;, 


0 


PP" A,(kk’) = ??'A*(k'R) 


= gen BBn { 1u,.* (73) “ux (rs)ri dr. (44c) 
0 


Usually the radial functions in (42) are real. Hence- 
forth the A’s will be written as real quantities, to 
simplify notation, although the calculation could easily 
be carried through for complex quantities. Use Eq. 
(39) and the relation 


8r 
f P,P P,P, de —(— Prt 3 OP YP) 


to evaluate the integral over the solid angle dQr;. 
Carry out the summation over the two electron spin 
orientations in the initial and intermediate states. The 
second-order interaction (43) then takes the form 


+3 (— 215+ 31st 3i_Ti,) LE) — E(k) DP 


nr 
eit ¥ 80, (8,8)| —es #0’ A :(k,k’) Hk’ P, 


4r 8x 
—— cok? 8'4A ;(k,k’) HkP,t—c,'cik'k PP’ A .(k,k’) 
5 15 
KS FP, oP — “P)) (45) 


Two permutations of this interaction must be made. The tensor interaction (41) may act on the nuclear spin j and the scalar interaction on i, and the intermediate state may be excited by the interaction with spin j, while the return to the ground state is made through the interaction with spin i. The last permutation will result in the addition of the complex conjugate expression. Finally one has to sum over all occupied states k and all unoccupied states k'. This integration over k- and k'-space will be carried out by separating again the angular and radial dependence.

³² For quantitative work this simplification is not permissible. The exponential factors will in many cases give the larger contribution to the angular dependence.

Assume that the quantities A, given by (29) and (44), and $[E(k')-E(k)]^{-1}$ do not depend on the direction of k or k'. The changes which result when this restriction is not made will be discussed later. The angular integrations over $d\Omega_k$ and $d\Omega_{k'}$ can then be completed, when $\exp[i(\mathbf k-\mathbf k')\cdot\mathbf R_{ij}]$ in (45) is expanded according to (40). The interaction between the nuclear spins becomes

(— 20415. +30 4 1i-+31i_Ti,) 2? PBs, 
with 


B= f f (b:-+b2+bs)[E(k’)— E(k) 
as X (2) 8Rdkdk’, 
C101’ RR’ (A; 9?’ As +??’AjAs) ex 
X (kR’Ri;?)-4 Jy (RRi;) Jy(R'Rii), 
bo(k,k’) ne -cilh "EA A; As) ae 
X (kR’R?)—? Jy (RRij) Jy(R'Ris), 
b3(k,k’) =——co! kA; #4’ A; +A, *#’A,) axe 
' X (RR’R:?)-? Jy (RR) Jy(R'Rii)- 


12 
bi(k,k’) = 
(46) 


Repeat the same calculation for the other terms of the 
tensor interaction (41), using (39). As an immediate 
consequence of group-theoretical arguments, the final 
result for the pseudo-dipolar interaction is 


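$$\mathcal H_{ij}^{pd} = \{\mathbf I_i\cdot\mathbf I_j - 3R_{ij}^{-2}(\mathbf I_i\cdot\mathbf R_{ij})(\mathbf I_j\cdot\mathbf R_{ij})\}\,B_{ij}. \qquad (47)$$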


As before the slight dependence of E(k)—E(k’) on the 
spin orientations has been neglected in this derivation, 
consistent with the spherical approximation. Spin-spin 
and spin-lattice relaxation processes can be defined for 
the pseudo-dipolar interaction in exactly the same 
manner as for the classical dipolar interaction. 

The constant $B_{ij}$, which was introduced earlier in a
phenomenological fashion, is now expressed quite generally
by Eq. (46). The integration over the wave
numbers can be carried out with certain simplifying 
assumptions, which may be adapted to the particular 
type of solid under investigation. The same assumptions 
will be made as in the discussion of the ordinary ex- 
change interaction. 

Assume that an insulator has a narrow valence band and a relatively wide gap $E_g$. Replace all higher bands by one conduction band with one effective mass $m'$. Then Eq. (30) may be used. Introduce suitable averages for the products of the quantities A appearing in the expressions for $b_1$, $b_2$, and $b_3$. These are analogous to the quantity $|A_{ij}|^2_{\mathrm{av}}$ introduced in (29a). They are related to the atomic hyperfine structure in the corresponding p-state by an expression similar to (35). The p-type hyperfine interaction in the solid has the order of magnitude of the atomic p-type hyperfine splitting times the percentage of p-type character (or s-d mixture) of the solid state wave function. Integrate k from 0 to $k_t$, the top of the valence band, given by (31), and k' from 0 to infinity. The integral (46) appears to diverge at the latter limit. The reason is that the coefficients c in the expansion certainly cannot remain constant to $k\to\infty$. Physical reality requires a normalization. Assume $c_1'=c_1(1+ak'^2)^{-1}$ and $c_2'=c_2(1+bk'^4)^{-1}$ as $k'\to\infty$, or some such relations, to insure convergence. The integration over k' can be carried through in the complex plane. In the result let $a\to0$ and $b\to0$. The parabolic law $E=\hbar^2k'^2/2m'$ is assumed in the band region of interest. The integration over k is elementary:


insB =a hm’ R,j* exp{ —h-(2mE,)'Ri;} 


4 
x| or PP’ A;+ PP’ AA; ett (1+/71 (2mE,) 1R;;) 


keRij 1 
x f (sinx—x ainaall” iia "4A + eA As ett 
0 


ktRij 
x f { (3—2?) siny—3x cosx}adx 
0 


1 
ee A+ VA A;)ett(3-+3h- (2mE,) IR, 


ktRij 


+2h-mE,R;;) f x snd | (48) 
0 


Comparison of this result with Eq. (32) shows that 
the pseudo-dipolar interaction has a radial de- 
pendence similar to that of the isotropic exchange, 
since the integrals go as R;,*’ for large R,;. At large 
distances the exponential factor is, however, dominating. The ratio $B_{ij}A_{ij}^{-1}$ has the order of magnitude of the hyperfine splitting ratio in corresponding p- and s-states times the relative amount of p- with respect to s-character of the wave function. Rather large variations around this order of magnitude can occur because of the different trigonometric functions in (32) and (48). Accidentally these could even make one of the quantities $A_{ij}$ or $B_{ij}$ zero, the other remaining finite.

A similar state of affairs for the ratio $B_{ij}/A_{ij}$ occurs in metals. In this case k in Eq. (46) must be integrated from 0 to $k_m$, and k' from $k_m$ to infinity over the conduction band. The integration of k' can formally be changed from zero to infinity without changing the result. The integrand is antisymmetric in k and k' and consequently
km km 

ff Getreteotew-207 

= X (24)-*h?*k*dkdk' =0. 
If one takes $E(k')-E(k)=(2m)^{-1}\hbar^2(k'^2-k^2)$, the principal value of the integral over k' can be evaluated in the complex plane.


4 
met B= ti tnaRe rst tA PP’ A;-+-??'A:A;) Fermi 


kmRij 
x f (sinx— x cosx) (cosx— x sinx)xdx 
0 


1 kmRij 
+—c2(A; *’¢A;+ 44.s)rani f cosx 
15 , 
X {(—22+3) sinx—3x cosx}xdx 


1 
Ph #0’ A 5+ #4’ A:A;) Fermi 


kmR ij 
x f sinx{ (—2?+3) cosx+3x sins) xd. (49) 
0 


The integrations over x=R,; are elementary and can 
be carried through by repeated partial integrations. 
The integrals go as Rj? for large Rij. The pseudo- 
dipolar interaction therefore has a R,j* dependence in 
metals. The ratio $B_{ij}/A_{ij}$ in the metal has again the order of magnitude of the ratio of the hyperfine splitting in pure p- and s-states multiplied by the relative amount of p- and s-character, respectively. The effective average of the products of the quantities A is now approximately equal to the value of this product on the Fermi surface, since electron scattering processes between k-states close to this surface contribute most to the integrals. At the Fermi surface the relative amount of p-character of the wave function may frequently exceed the amount of s-character. Taking the hyperfine splitting in a pure p-state to be about 10 percent of that in an s-state, $B_{ij}/A_{ij}$ can still easily assume a value of, say, 30 percent. This is particularly true if the variations in the numerical values of the trigonometric expressions involved in $A_{ij}$ and $B_{ij}$ are taken into account. The observed ratio $B_{ij}/A_{ij}=0.3$ for nearest neighbors in thallium is thus entirely reasonable. It indicates that the amount of p-character probably exceeds the amount of s-character of the electron wave functions at the Fermi surface in this metal. Quantitative conclusions are difficult to make, as Eqs. (34) and (49) are rather crude approximations for this case. The complicated band structure would certainly make a numerical evaluation of integrals (46) necessary.

Resonance lines in pure isotopes will in general not have widths narrower than the classical dipolar interaction due to exchange narrowing. Usually the pseudo-dipolar interaction will be a sizeable fraction of the exchange interaction, and the lines will be rather broader than Van Vleck's formula for classical dipolar interaction would predict.

The fact that the Pb²⁰⁷ resonance in natural lead is rather broad, while Pb²⁰⁷ with 22.6 percent abundance is the only stable lead isotope with nonzero spin, indicates pseudo-dipolar interaction in metallic lead. To estimate the magnitude of both the exchange and pseudo-dipolar interaction in this case, measurements on the Pb²⁰⁷ in the presence of another isotope with nonzero spin should be made. The rapid increase in line width on alloying observed¹ in some cases may be due in part to exchange broadening.

The pseudo-dipolar interaction has the same order of 
magnitude in cubic crystals as in noncubic crystals. 
One could formally add to the expansion (42) terms 
depending on the angle between r and the crystallo- 
graphic axes. For an axially symmetric crystal the 
leading term in the expansion for the wave function 
would be tke; "*P; and in the energy "Ps, where c is 
the unit vector in the direction of the axis of symmetry. 
In angular integration small terms are added to the result. One effect is that the constant $B_{ij}$ may depend on the angle between $\mathbf R_{ij}$ and c. The changes are not observable in the experiments described in this paper.

Finally the second-order perturbation with the tensor interaction (41) on both nuclei i and j must be considered. Two integrals over $r_i$ and $r_j$, containing ${}^{Hr_i}P_2^m$ and ${}^{Hr_j}P_2^m$, will now occur, instead of the single one over $r_i$ in Eq. (43). The interaction is still linear in the nuclear spins i and j. From group-theoretical arguments it is clear, therefore, that the final dependence on H contains no spherical harmonics of order higher than the second. Detailed calculations, carrying out the proper summations over the electron spin orientations and going through steps similar to those which led to Eq. (46) for the tensor-scalar interaction, confirm this. A typical term of the tensor-tensor interaction has the form:


dn 2 
Wf f (4r)(=) 2c2? *’4A 5(R’,R) 8'4A -(k,k’) 
wt 


mJ y(RR)J,(R’R) (2r)-6 
2(ARijk’ Rij) *LE(k’)— E(k) ] 
—Uh- G-3Rj7(h- Ry) j-Rs)] 


x J A (any( =) c# "4A ,(h!,k) *4A,(k,k’) 


J 4(kR)J,(R'R) 
2(kk’R,})'[ E(k’) — E(k) ] 


RSk!*dkdk’ 








(2) *k®k’*dkdk’. 


The double tensor-interaction contributes both to the 
isotropic exchange and the pseudo-dipolar interaction. 







There are five additional terms containing different 
product combinations of *’¢A, *’A, and ??’A. These 
contributions should be added algebraically to the 
expressions for $A_{ij}$ and $B_{ij}$ derived previously. The additions have the magnitude of $B_{ij}^2A_{ij}^{-1}$. The relative correction for $A_{ij}$ in the case of thallium is only 10 percent. Rather wide variations in the actual values of the corrections are possible due to the trigonometric functions. For nearest neighbors the magnitude of the tensor-tensor interaction is the square of hyperfine splitting in the p-state times the square of the fractional amount of p-character, divided by the Fermi energy.

VII. CONCLUDING REMARKS

For a heavy metal with a simple band structure it 
might be possible to give a quantitative interpretation 
of observed exchange interactions. The theory is not 
sufficiently refined to predict the numerical values found 
for thallium and thallium oxide, although it certainly 
gives the right order of magnitude. This is already a 
great deal better than can be achieved for electron spin 
exchange. Whereas a quantitative atomic theory of 
ferromagnetic properties is still lacking, nuclear ferro- 
magnetic characteristics may be derived from first 
principles. This is at least true for insulators, where the 
exchange interaction is confined to near neighbors and 
complications with conduction electron spins are 
avoided. The Curie or Néel temperature is determined 
by the value A;;, Eqs. (32) and (36). 

The dipolar terms will give rise to magnetic anisotropy below the Curie point. Van Vleck's theory for ferromagnetic anisotropy³³ applies equally well to the nuclear case.³⁴ For cubic spin systems it is necessary to go to higher-order terms in the expansion of the free energy of the nuclear spin system to obtain anisotropy from the dipolar interaction. In such cubic crystals the magnetic anisotropy would then be inversely proportional to the external field H. In ordinary electron ferromagnetism the anisotropy is not field dependent, because the internal Weiss field is always large compared to H. A first-order field-independent anisotropy of nuclear magnetism in cubic crystals may arise from quadrupole-quadrupole coupling, if the nuclear spins have $I>\tfrac12$.

In our opinion, however, the main interest of nuclear 
spin exchange does not lie in the field of extremely low 
temperatures and nuclear ferromagnetism. It seems 
more important that the exchange and pseudo-dipolar 
interaction give information about the electron wave 
functions in the lattice and, in particular, about their 
nonspherical character. 

The interaction (27) does not represent the complete Hamiltonian³⁵ for the nuclear spin I. If the nucleus has a quadrupole moment it will interact with a noncubic charge distribution of the electron which will have matrix elements connecting the ground state k with the intermediate excited state k'. In second-order perturbation dipole-quadrupole and quadrupole-quadrupole interactions will result. The latter is quadratic in the nuclear spins $\mathbf I_i$ and $\mathbf I_j$ and has the order of magnitude of the square of the quadrupole hyperfine splitting over the Fermi energy. Since the quadrupole interaction usually produces relatively small deviations from the interval rule, and can only operate on the "non s" character of the wave functions, its order of magnitude will in general be even smaller than the tensor-tensor interaction discussed in the preceding section. Since the thallium isotopes have $I=\tfrac12$, the interaction is absent in our experiments and will not be discussed further.

³³ J. H. Van Vleck, Phys. Rev. 52, 1178 (1937).

³⁴ It applies, strictly speaking, only to the nuclear case. The Heitler-London perturbation procedure does not converge for the case of electrons.

Another term which has not been considered is the
nuclear spin-electron orbit interaction $2g_N\beta_N\beta\, r_i^{-3}\,\mathbf I_i\cdot\mathbf L$.
It can be safely assumed that the second-order coupling
between nuclear spins in solids due to this term is
orders of magnitude smaller than the interaction via
electron spins, as has been shown to be the case for
molecules.⁴,⁵ One can also introduce the electron spin-
orbit coupling and consider the interplay of spin and 
orbital effects. To obtain a nonvanishing result for the 
nuclear spin interaction, the operators L and S should 
in general each occur in even powers in the Hamil- 
tonian. The orbital effects will not be discussed in this 
paper. Although they make only a minor contribution 
to the nuclear spin-spin coupling, they are of great 
importance for the chemical shifts in solids. The large 
value of this shift in thallium oxide points to a small 
energy gap between bands, as did the large value of the 
exchange constant. It follows from Ramsey's theory³⁶
that the relative anisotropy of the chemical shift 
should have the same order of magnitude as the anisot- 
ropy in the diamagnetic susceptibility, although they 
need not have the same value. A relative anisotropy in 
the chemical shift of 34 percent is thus entirely reason- 
able. It would be desirable to develop a quantitative 
theory of the orbital effects in solids on nuclear spins. 

A few words must be said about the effect of nuclear 
motion on the exchange interaction. The influence of 
diffusion or other types of motion in the nonrigid 
lattice and in liquids on the pseudo-dipolar interaction 
is exactly the same as on the classical dipolar interaction,
which has been treated in great detail.³⁷

³⁵ A. Abragam and M. H. L. Pryce, Proc. Roy. Soc. (London) A205, 135 (1951).

³⁶ N. F. Ramsey, Phys. Rev. 86, 243 (1952).

³⁷ Bloembergen, Purcell, and Pound, Phys. Rev. 73, 678 (1948); R. Kubo and K. Tomita, J. Phys. Soc. Japan 9, 888 (1954).

The exchange broadening between unlike spins can undergo a motional narrowing too. A pure rotation, leaving the internuclear distance between the two nuclei unchanged, has however no effect in this case. The exchange interaction has no angular dependence.
Unlike the dipolar interaction, it does not average out 
on rotation and this is the reason that exchange be- 
tween unlike neighbors in molecules is observable in 
liquid and gaseous samples. If the nuclei with their 
spins diffuse rapidly with respect to each other, the 
exchange interaction $A_{ij}$ [Eqs. (32) and (34)] has to
be averaged over all $R_{ij}$ and then summed over all j.
Another way of describing this same phenomenon is to
say that a neighboring position of nucleus i is at one
time occupied by a nuclear spin j pointing up and some
time later by another spin k pointing down. A complete
analysis of this motional narrowing of exchange broaden- 
ing is beyond the scope of this paper. It is somewhat 
unexpected that the resonance in liquid metallic thal- 
lium could not be observed and the Hg! resonance 
in liquid mercury is very broad. The thallium resonance 
in liquid thallium-mercury alloys is also extremely 
broad. These facts seem to indicate molecular associa- 
tion in the liquid metals. In the rapidly rotating and 
diffusing molecules the exchange interaction is not 
averaged, and the interchange of nuclei between mo- 
lecular assemblies which would lead to narrowing is too 
slow. A more careful investigation must be made 
before definite conclusions can be drawn. 

Finally the universal character of the exchange coupling between nuclear spins is stressed. It should be considered whenever heavy atoms are involved, not only in metals, but also in valence and ionic crystals as well as in liquids and molecules. In particular, it will make a contribution to splittings observed in molecular spectra with the molecular beam method. A large spin-rotation interaction has been found³⁸ in the TlCl molecule. It follows from a theoretical analysis of the TlCl molecule in the "very weak field" case³⁹ that an interaction of the type $A\,\mathbf I_{\mathrm{Cl}}\cdot\mathbf I_{\mathrm{Tl}}$, which was not considered, will give rise to a similar splitting of the energy levels, for which $F_1=I_{\mathrm{Cl}}+J$, $J$, $I_{\mathrm{Cl}}$, and $I_{\mathrm{Tl}}$ are good quantum numbers, as the considered interaction $C_2\,\mathbf I_{\mathrm{Tl}}\cdot\mathbf J$. In principle both interactions contribute to the observed fine structure of the $F=F_1+I_{\mathrm{Tl}}$ levels, and the observed quantity of 73 kc/sec corresponds to $C_2-A$, and not to $C_2$ alone. It can be shown, however, that the contribution from A must be small in this case. A distinction between the two types of interaction can be made by changing the isotopic species, since A is proportional to the gyromagnetic ratio of the chlorine isotopes, whereas $C_2$ is not. The $g_N$-factors of the chlorine isotopes have the ratio 0.8324, whereas the observed splittings for the TlCl³⁵ and TlCl³⁷ molecules were the same within a 3 percent accuracy. Similarly the pairs Rb⁸⁵F and Rb⁸⁷F have roughly the same splitting,⁴⁰ although the g-factors for the Rb isotopes differ by more than a factor three. Only when two heavy isotopic species are present in the diatomic molecule can a contribution of A of 10 kc/sec or more be expected, and the nuclear spin exchange might become observable in molecular beam experiments.

³⁸ Carlson, Lee, and Fabricand, Phys. Rev. 85, 784 (1952).
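The isotope argument above can be turned into a rough bound. The following arithmetic sketch is an illustration only and rests on assumptions not stated in the text: the splitting is taken as $C_2-A$ with only A scaling with the chlorine $g_N$-factor, and the quoted 3 percent agreement is read as the largest allowed difference between the two isotopic splittings. The function name is hypothetical.

```python
def exchange_bound(splitting_kc, relative_agreement, g_ratio):
    """Upper bound on |A| (kc/sec) if the splitting is C2 - A and only A scales with g_N."""
    # Largest difference between the two isotopic splittings still compatible
    # with the quoted agreement (an assumed reading of "within 3 percent").
    allowed_difference = relative_agreement * splitting_kc
    # Changing the isotope changes A by the factor g_ratio, so the splitting
    # changes by |A| * |1 - g_ratio|.
    return allowed_difference / abs(1.0 - g_ratio)


bound = exchange_bound(splitting_kc=73.0, relative_agreement=0.03, g_ratio=0.8324)
print(round(bound, 1))
```

Under these assumptions A is limited to roughly a dozen kc/sec in TlCl, which is compatible with the statement that A must be small there.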
The free precession of an ensemble of nuclear quadrupole moments in an axial electric field gradient is
studied by the pulsed nuclear induction method. A quantum-mechanical analysis describes the free precession 
and spin echo signals which result from the application of single and double pulses of radio-frequency field 
at the condition of zero-field quadrupole resonance. Beat modulation effects exhibited by free precession 
signals in a small constant external magnetic field are predicted by analysis. An alternative semiclassical 
description of quadrupole precession is given, which is analogous to the macroscopic nuclear induction 
equations of Bloch. Theory is verified by observation of free precession signals of chlorine in NaClO3. 





I. INTRODUCTION 


RECENTLY the authors published brief reports¹,² on the discovery of magnetic induction effects due to free nuclear precession in pure quadrupole resonance. These effects follow very closely the spin echo phenomena³ observed in nuclear moments coupled to a strong externally applied magnetic field. The aim of this paper is to develop the theory needed to explain induction phenomena in quadrupole systems and to describe the basic experiment.

When a system of nuclear spins is placed in a magnetic field $H_0$ in the usual nuclear induction experiment, the thermal equilibrium established between the spins and their surroundings provides a Boltzmann population distribution among the various spin states. The nuclear ensemble therefore exhibits a nonzero macroscopic magnetization $M_0$ parallel to $H_0$. Suppose that $M_0$ is suddenly rotated to make an angle $\theta$ with respect to $H_0$. $M_0$ will then precess about $H_0$ at the Larmor frequency $\omega_0=\gamma H_0$, where $\gamma$ is the nuclear gyromagnetic ratio. An inductive coil wound about the nuclear sample with its axis perpendicular to $H_0$ will then, as a result of free nuclear precession, have a voltage induced in it which is proportional to $\omega_0M_0\sin\theta$. In practice such a voltage decays in a time $T_2$, where $T_2$ is the time taken for different nuclei in the sample to lose precessional coherence because of the following effects: (1) different fixed rates of precession throughout the spin sample volume caused by static internal "local" fields or by an applied inhomogeneous magnetic field; (2) spin-lattice relaxation; (3) processes which cause random fluctuation of precessional frequency and phase arising from spin-spin coupling and, particularly in the case of liquids, molecular self-diffusion in an external field gradient.³

In the spin echo experiment, free induction signals are observed after transitions have been induced in the spin system by an rf magnetic field, perpendicular to $H_0$, applied in the form of pulses. If the rf field of magnitude $2H_1$ at a frequency $\omega\approx\omega_0=\gamma H_0$ is applied for a time $t_w$, the nuclear magnetization vector $M_0$ is rotated through an angle $\theta=\gamma H_1t_w$, provided that $t_w\ll T_2$. If pulses are applied at times $t=0$, $t=\tau$, and $t=T$ for example, free induction signals are observed following each of the pulses; and they occur in addition at times $t=2\tau$, $T+\tau$, $2T$, $2T-\tau$, and $2T-2\tau$ as spin echo signals. The properties of free induction signals have facilitated accurate measurements of the spin-lattice relaxation time $T_1$, the total relaxation time $T_2$ (or inverse line width), and self-diffusion in gases and liquids in magnetic resonance experiments³⁻⁵ where such measurements would have been very inaccurate or impossible by steady state methods. For a detailed physical model of spin echoes the reader is referred to the literature.³,⁶

* Present address: Kamerlingh Onnes Laboratory, University of Leiden, Leiden, The Netherlands.

¹ M. Bloom and R. E. Norberg, Phys. Rev. 93, 638 (1954).

² E. L. Hahn and B. Herzog, Phys. Rev. 93, 639 (1954).

³ E. L. Hahn, Phys. Rev. 80, 580 (1950).

In the zero-field nuclear quadrupole resonance experiment,⁷,⁸ an oscillating magnetic field causes transitions between energy levels corresponding to different orientations of the nuclear charge distribution with respect to crystalline electric field gradients. Since the electric interaction is invariant with respect to reversal of the nuclear spin direction, no macroscopic nuclear magnetization is present at thermal equilibrium. It is therefore not obvious that free induction effects similar to spin echo phenomena previously studied can be produced here. From the quantum-mechanical calculation which follows,⁹ it will be seen, however, that the effect of inducing magnetic transitions by pulses of the rf magnetic field is to produce a macroscopic oscillating nuclear magnetization. For the case of a symmetric electric field gradient $\nabla\mathbf E$, this magnetization is produced in the plane perpendicular to the axis of symmetry of $\nabla\mathbf E$, and its precessional motion can be given a semiclassical description.

⁴ H. Y. Carr and E. M. Purcell, Phys. Rev. 94, 630 (1954).

⁵ D. F. Holcomb and R. E. Norberg, Phys. Rev. (to be published).

⁶ E. L. Hahn, Phys. Today 6, No. 11, 4 (1953).

⁷ H. G. Dehmelt and H. Krüger, Z. Physik 129, 401 (1951).

⁸ R. V. Pound, Phys. Rev. 79, 685 (1950).

⁹ A similar quantum-mechanical method for calculating spin echoes is given by E. L. Hahn and D. E. Maxwell, Phys. Rev. 88, 1070 (1952).


II. THEORY


The electric field gradient tensor $\nabla\mathbf E$ is assumed to be symmetric about the z axis. The quadrupole interaction $\mathbf Q\cdot\nabla\mathbf E$, where $\mathbf Q$ is the nuclear quadrupole moment tensor, then satisfies the eigenvalue equation⁸

$$-(\mathbf Q\cdot\nabla\mathbf E)\,\phi_m = \frac{e^2qQ}{4I(2I-1)}\,[3m^2-I(I+1)]\,\phi_m = E_m\phi_m, \qquad (1)$$

where $eq$ is the scalar electric field gradient, $eQ$ is the scalar nuclear electric quadrupole moment, and $I$ is the nuclear spin ($I>\tfrac12$). We choose the representation in which $\phi_m$ is the eigenfunction of $I_z$. The matrix elements for the spin operators $I_z$ and $I_\pm=I_x\pm iI_y$ in Dirac notation are then given by

$$(m|I_\pm|m\mp1)=[(I\pm m)(I\mp m+1)]^{1/2}, \qquad (m|I_z|m)=m. \qquad (2)$$

An rf field $2H_1$ which couples to the nuclear magnetic moment induces transitions corresponding to $\Delta m=\pm1$. Because of the degeneracy of the $\pm m$ states, there are $I$ transition frequencies for integral spin and $I-\tfrac12$ transition frequencies for half-integral spin. When no constant magnetic field is applied, the calculations of induction signals for all transitions are identical except for multiplicative constants. It will therefore be sufficient to treat only the single transition for $I=\tfrac32$ in order to illustrate the general properties of the induction effects.
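The level scheme and the count of transition frequencies can be tabulated directly from the eigenvalue expression in Eq. (1) and the matrix elements of Eq. (2). The sketch below is an illustration under the assumption that $E_m=e^2qQ\,[3m^2-I(I+1)]/[4I(2I-1)]$; energies are returned in units of $e^2qQ$, and the function names are hypothetical.

```python
from fractions import Fraction
import math


def quadrupole_levels(spin):
    """E_m in units of e^2 q Q for an axially symmetric field gradient."""
    I = Fraction(spin)
    levels = {}
    m = -I
    while m <= I:
        levels[m] = (3 * m * m - I * (I + 1)) / (4 * I * (2 * I - 1))
        m += 1
    return levels


def transition_frequencies(spin):
    """Distinct nonzero |E_m - E_(m-1)|, i.e. the Delta m = +-1 lines."""
    levels = quadrupole_levels(spin)
    freqs = set()
    for m, energy in levels.items():
        if m - 1 in levels:
            gap = abs(energy - levels[m - 1])
            if gap != 0:
                freqs.add(gap)
    return sorted(freqs)


def raising_element(spin, m):
    """(m | I_+ | m-1) = [(I + m)(I - m + 1)]^(1/2)."""
    I, m = Fraction(spin), Fraction(m)
    return math.sqrt((I + m) * (I - m + 1))


for spin in ("1", "3/2", "2", "5/2"):
    print(spin, transition_frequencies(spin))
```

For $I=\tfrac32$ this gives a single line at $e^2qQ/2$ (in units of $\hbar$), and the two matrix elements that enter the later equations of motion are `raising_element("3/2", "3/2")` $=\sqrt3$ and `raising_element("3/2", "1/2")` $=2$.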


A. Zero External Magnetic Field Case

The Hamiltonian in this case is given by

$$\mathcal H=-\mathbf Q\cdot\nabla\mathbf E-\gamma\hbar\,\mathbf I\cdot\mathbf H(t). \qquad (3)$$

The rf coil provides a magnetic field $|\mathbf H(t)|=H_x=2H_1\cos\omega t$ along the x axis for $0<t<t_w$ and $H_x=0$ for $t>t_w$. The wave function $\psi$ for $I=\tfrac32$ is expressed in terms of the eigenfunctions of the electric quadrupole interaction as

$$\psi=\sum_m C_m(t)\,\phi_m\,e^{-i\omega_mt}, \qquad (4)$$

where $\omega_m=\omega_{-m}=E_m/\hbar$. We now solve the time-dependent Schrödinger equation,

$$i\hbar\dot\psi=\mathcal H\psi, \qquad (5)$$

in the region $0<t<t_w$. After substituting (3) and (4) into (5), and using the orthogonality properties of $\phi_m$, we obtain

$$\dot C_m=i\omega_1\big[(m|I_+|m-1)\,e^{i(\omega_m-\omega_{m-1})t}\,C_{m-1}+(m|I_-|m+1)\,e^{i(\omega_m-\omega_{m+1})t}\,C_{m+1}\big]\cos\omega t, \qquad (6)$$

where $\omega_1=\gamma H_1$. For $I=\tfrac32$, when $|\omega-(\omega_{3/2}-\omega_{1/2})|\ll1/t_w$, near resonance, Eq. (6) gives

$$\dot C_{\pm3/2}=(\sqrt3/2)\,i\omega_1C_{\pm1/2}, \qquad \dot C_{\pm1/2}=(\sqrt3/2)\,i\omega_1C_{\pm3/2}+2i\omega_1C_{\mp1/2}\cos\omega t. \qquad (7)$$

For our experiments, where the conditions $\omega t_w\gg2\pi$ and $\omega_1\ll\omega$ are well satisfied, the term containing $\cos\omega t$ in (7) may be neglected since its contribution to the induction will be vanishingly small. Equations (7) have solutions:

$$C_{\pm3/2}(t)=C_{\pm3/2}(0)\cos(\sqrt3\omega_1t/2)+iC_{\pm1/2}(0)\sin(\sqrt3\omega_1t/2),$$
$$C_{\pm1/2}(t)=C_{\pm1/2}(0)\cos(\sqrt3\omega_1t/2)+iC_{\pm3/2}(0)\sin(\sqrt3\omega_1t/2). \qquad (8)$$

At $t=0$ the spin system is in thermal equilibrium and an excess spin population is established in the $m=\pm\tfrac12$ states for positive $e^2qQ$. With the normalization condition that $\sum_m|C_m(t)|^2=1$, and considering the normalization only with respect to the excess population, the initial population coefficients at $t=0$ become

$$C_{1/2}(0)=1/\sqrt2, \quad C_{-1/2}(0)=e^{i\delta}/\sqrt2, \quad\text{and}\quad C_{\pm3/2}(0)=0, \qquad (9)$$

where $\delta$ is a random phase factor over which the physical observables must be averaged. This variable expresses the fact that in coming into thermal equilibrium with the lattice the nuclei have interacted with randomly phased electric and magnetic fields.¹⁰ After applying the conditions in (9) to Eqs. (8), the population coefficients after the rf pulse are

$$C_{1/2}(t_w)=(1/\sqrt2)\cos(\sqrt3\omega_1t_w/2), \qquad C_{3/2}(t_w)=(i/\sqrt2)\sin(\sqrt3\omega_1t_w/2),$$
$$C_{-1/2}(t_w)=e^{i\delta}C_{1/2}(t_w), \qquad C_{-3/2}(t_w)=e^{i\delta}C_{3/2}(t_w). \qquad (10)$$

The solution of the Schrödinger equation (5) during free nuclear precession when $t>t_w$ and $H_1=0$ is

$$\psi=\sum_{m=-3/2}^{3/2}\phi_m\,C_m(t_w)\,e^{-i\omega_m(t-t_w)}, \qquad (11)$$

where expressions for $C_m(t_w)$ are given in (10). The free induction signal will be produced in the xy plane by a nonvanishing nuclear magnetization which is proportional to the expectation value of $I_+$, expressed as

$$\langle I_+\rangle=(\psi|I_+|\psi)=(\psi|I_x|\psi)+i(\psi|I_y|\psi). \qquad (12)$$

Inserting $\psi$ from (11) into (12) the average values of the operators are found to be

$$\langle I_x\rangle=(\sqrt3/2)\sin(\sqrt3\omega_1t_w)\sin\omega_0(t-t_w), \quad \langle I_y\rangle=0, \quad \langle I_z\rangle=0, \qquad (13)$$

where $\omega_0=\omega_{3/2}-\omega_{1/2}=e^2qQ/(2\hbar)$.
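The step from the post-pulse coefficients (10) to the induction signal (13) can be verified numerically. The sketch below is an illustration, assuming the coefficients and level ordering as reconstructed here ($\omega_{\pm3/2}-\omega_{\pm1/2}=\omega_0$) and replacing the average over the random phase $\delta$ by an average over equally spaced phases. All names and test values are hypothetical.

```python
import cmath
import math

SQ3 = math.sqrt(3.0)


def coefficients_after_pulse(w1, tw, delta):
    """Eq. (10): amplitudes of m = 3/2, 1/2, -1/2, -3/2 just after the pulse."""
    half_angle = SQ3 * w1 * tw / 2.0
    c_half = math.cos(half_angle) / math.sqrt(2.0)
    c_three_half = 1j * math.sin(half_angle) / math.sqrt(2.0)
    phase = cmath.exp(1j * delta)
    return {1.5: c_three_half, 0.5: c_half,
            -0.5: phase * c_half, -1.5: phase * c_three_half}


def expect_I_plus(coeffs, w0, elapsed):
    """<I_+> for the freely precessing state of Eq. (11)."""
    # Only frequency differences matter; choose omega_m symmetric about zero.
    omega = {1.5: w0 / 2, -1.5: w0 / 2, 0.5: -w0 / 2, -0.5: -w0 / 2}
    total = 0j
    for m in (-0.5, 0.5, 1.5):  # I_+ connects m - 1 to m
        element = math.sqrt((1.5 + m) * (1.5 - m + 1))  # Eq. (2) with I = 3/2
        upper = coeffs[m] * cmath.exp(-1j * omega[m] * elapsed)
        lower = coeffs[m - 1] * cmath.exp(-1j * omega[m - 1] * elapsed)
        total += upper.conjugate() * element * lower
    return total


def phase_averaged_I_plus(w1, tw, w0, elapsed, n_phase=16):
    """Average of <I_+> over n_phase equally spaced values of delta."""
    values = [
        expect_I_plus(coefficients_after_pulse(w1, tw, 2 * math.pi * k / n_phase),
                      w0, elapsed)
        for k in range(n_phase)
    ]
    return sum(values) / n_phase


w1, tw, w0, elapsed = 1.0, 0.4, 50.0, 0.013
signal = phase_averaged_I_plus(w1, tw, w0, elapsed)
expected = (SQ3 / 2) * math.sin(SQ3 * w1 * tw) * math.sin(w0 * elapsed)
print(signal.real, expected)
```

The real part reproduces $\langle I_x\rangle$ of (13) and the imaginary part, $\langle I_y\rangle$, vanishes. The term connecting $m=-\tfrac12$ and $m=+\tfrac12$ carries the factor $e^{i\delta}$ and drops out only after the phase average.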


¹⁰ R. C. Tolman, Principles of Statistical Mechanics (Oxford University Press, London, 1938), p. 351.





MAGNETIC INDUCTION 


A macroscopic oscillating nuclear magnetization proportional
to $\bar{I}_x$ is thus set up along the $x$ axis of the rf
coil, and it is in fact linearly polarized along this axis.
We will show this more clearly in the next section by
proving in general that the net magnetization vectors
derived from the $+m$ and $-m$ states independently
obey quasi-classical differential equations of nuclear
induction analogous to those obtained by Bloch.¹¹ The
quantity $\sqrt{3}\omega_1 t_w$ represents the angle $\theta$ through which
both of the magnetization vectors associated with $+m$
and $-m$ have been rotated by $H_1$.

As in the case of free precession in a field $H_0$, different
rates of precession for nuclei can cause a dephasing and
consequent attenuation of $\bar{I}_x$. This can be introduced
phenomenologically in our theory, as was done before,
by averaging $\bar{I}_x$ over a distribution of frequencies:

$$g(\omega_0' - \omega_0) = \exp[-(\omega_0' - \omega_0)^2/(2\delta^2)]/(2\pi\delta^2)^{1/2}, \qquad (14)$$

where

$$\int_0^\infty g(\omega_0' - \omega_0)\, d\omega_0' = 1,$$

and $\delta$ is the root-mean-square deviation in precessional
frequency (relative to an average frequency $\omega_0$) due to
a spread in the field gradient $eq$ over the sample volume.
Strains and imperfections in the crystal and a temperature
gradient over the sample ($eq$ is temperature
dependent) will contribute to $\delta$. The effect of magnetic
dipolar broadening and of externally applied magnetic
fields will be discussed later when a constant magnetic
field is introduced into our calculation. With decay now
due to an $eq$ variation, neglecting all other relaxation
effects, the observed induction for $t > t_w$ from (13) [let
$\omega_0 \to \omega_0 + (\omega_0' - \omega_0)$] becomes

$$\bar{I}_{x,\mathrm{av}} = \int g(\omega_0' - \omega_0)\, \bar{I}_x\, d\omega_0'
= (\sqrt{3}/2)\sin(\sqrt{3}\omega_1 t_w)\sin\omega_0(t - t_w)\exp[-\delta^2(t - t_w)^2/2]. \qquad (15)$$
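The Gaussian decay factor in (15) follows from averaging the sine over the distribution (14). A minimal numerical sketch of that step is given below; it assumes ω0 ≫ δ, so that the lower limit of the integral may be replaced by a cutoff a few widths below ω0, and uses arbitrary illustrative values and function names.

```python
import math

def gaussian_average_of_sine(omega0, delta, t, half_width=8.0, steps=4000):
    """Simpson-rule average of sin(w' t) over a Gaussian in w' centred on
    omega0 with rms width delta, truncated at +/- half_width * delta."""
    lo = -half_width * delta
    h = 2.0 * half_width * delta / steps
    norm = 1.0 / math.sqrt(2.0 * math.pi * delta * delta)
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h  # x = w' - omega0
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        g = norm * math.exp(-x * x / (2.0 * delta * delta))
        total += weight * g * math.sin((omega0 + x) * t)
    return total * h / 3.0

def predicted(omega0, delta, t):
    """sin(omega0 t) multiplied by the Gaussian decay factor of Eq. (15)."""
    return math.sin(omega0 * t) * math.exp(-delta * delta * t * t / 2.0)
```

Here t stands for the elapsed time t − t_w after the pulse; the two functions should agree for any choice of ω0, δ, and t satisfying the stated assumption.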


When a second pulse of width $t_w$ is applied at $t = \tau$, an
echo is formed, given by

$$\bar{I}_{x,\mathrm{av}} = -(\sqrt{3}/2)\sin(\sqrt{3}\omega_1 t_w)\sin^2(\sqrt{3}\omega_1 t_w/2)\sin\omega_0(t - 2\tau)\exp[-(t - 2\tau)^2\delta^2/2], \qquad (16)$$

where $t > \tau + t_w$. These solutions are similar to those
obtained for $I = 1/2$ in nuclear magnetic resonance,
except that $\omega_1$ in that case is replaced by $\sqrt{3}\omega_1$ here.
Equations (15) and (16) apply to the case of a single
crystal where all nuclei are oriented by a gradient $eq$
which is perpendicular to the axis of the rf coil. In
a powdered crystalline sample, the effective rf component
is $H_1\sin\theta_1$, where $\theta_1$ is the angle between $H_1$ and
the axis of spin quantization. The observed component


11 F. Bloch, Phys. Rev. 70, 460 (1946).
of $\bar{I}_x$ along the coil axis is $\bar{I}_x\sin\theta_1$. Introducing these
considerations into (15), for example, the observed
induction for a powdered sample is then obtained by
averaging $\bar{I}_x$ with respect to $\theta_1$ over a sphere. The
integral

$$S = \int_0^{\pi/2} \sin\theta_1 \sin(\sqrt{3}\omega_1 t_w \sin\theta_1)\, d\Omega, \qquad d\Omega = \sin\theta_1\, d\theta_1,$$

is then applied, where $S$ replaces the factor $\sin(\sqrt{3}\omega_1 t_w)$
in (15). The integration gives the summation

$$S = 2\sum_{n=0}^{\infty} \frac{(-1)^n (2\sqrt{3}\omega_1 t_w)^{2n+1} [(n+1)!]^2}{(2n+1)!\,(2n+3)!}
= \frac{4}{3} J_1(\sqrt{3}\omega_1 t_w) - 4\sum_{n=1}^{\infty} \frac{J_{2n+1}(\sqrt{3}\omega_1 t_w)}{(2n+1)(2n-1)(2n+3)}.$$

Since this series converges very rapidly, only the first
few Bessel functions are of importance. The principal
behavior of $S$ is then determined by $J_1(\sqrt{3}\omega_1 t_w)$.
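As a numerical cross-check on the powder factor, the following minimal sketch compares a direct Simpson-rule evaluation of the integral with the Bessel-function series. It assumes that S is normalized as the integral of sin²θ1 sin(x sin θ1) over 0 ≤ θ1 ≤ π/2 with x = √3 ω1 t_w, and that the series is (4/3)J1(x) − 4 Σ J_{2n+1}(x)/[(2n+1)(2n−1)(2n+3)] summed from n = 1; the function names are illustrative only.

```python
import math

def bessel_j(order, x, terms=40):
    """Bessel function of the first kind, J_order(x), from its power series."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k / (math.factorial(k) * math.factorial(k + order))
                  * (x / 2.0) ** (2 * k + order))
    return total

def powder_factor_integral(x, steps=2000):
    """Simpson's rule for the integral of sin^2(theta) sin(x sin(theta))
    over 0 <= theta <= pi/2."""
    h = (math.pi / 2.0) / steps

    def integrand(theta):
        s = math.sin(theta)
        return s * s * math.sin(x * s)

    total = integrand(0.0) + integrand(math.pi / 2.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

def powder_factor_bessel(x, n_max=10):
    """Bessel series: (4/3) J_1 minus the sum over odd orders 2n+1, n >= 1."""
    total = (4.0 / 3.0) * bessel_j(1, x)
    for n in range(1, n_max + 1):
        total -= 4.0 * bessel_j(2 * n + 1, x) / ((2 * n + 1) * (2 * n - 1) * (2 * n + 3))
    return total
```

For moderate pulse angles the first term, (4/3)J1(x), already accounts for most of S, which is the sense in which the series converges rapidly.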


III. QUASI-CLASSICAL NUCLEAR INDUCTION 
EQUATIONS FOR ZERO FIELD 
QUADRUPOLE RESONANCE 


In this section, we show how a general set of differential
equations may be obtained to describe a quasi-classical
vector model of the expectation values of
nuclear magnetization discussed in the previous section.¹²
The procedure in the last section has given the
result of this process for $I = 3/2$. A vector model of the
behavior of the magnetization vectors corresponding to
the $+m$ and $-m$ states will now be obtained. This
method however does not apply generally in the
presence of a constant magnetic field $H_0$.

Let $a_m(t) = C_m(t)e^{-i\omega_m t}$, and let

$$\psi = \sum_{m>0} a_m(t)\phi_m \qquad (17)$$


represent the total wave function only for those nuclei
which have positive $m$ values. Since there is no mixing
between $+m$ and $-m$ states when the Hamiltonian (3)
is used, it is sufficient to analyze the motion for $+m$
states only. The final result will describe the corresponding
$-m$ states also except that they perform a
spin precession in a direction opposite to that of the $+m$
states. In the Hamiltonian (3), let

$$\mathbf{I}\cdot\mathbf{H} = (I_+H_- + I_-H_+)/2,$$

where $H_\pm = H_x \pm iH_y$. Using $\psi$ from (17), the Schrödinger
equation (5) provides a differential equation for the
population coefficient $a_m$:

$$\dot{a}_m = -i\omega_m a_m + (i\gamma/2)\{H_+[(I-m)(I+m+1)]^{1/2}a_{m+1} + H_-[(I+m)(I-m+1)]^{1/2}a_{m-1}\}. \qquad (18)$$


12 A related macroscopic description of quadrupole systems is
given by F. Lurçat, Compt. rend. 238, 1386 (1954).

The resonance experiment provides a single rf frequency
at a quadrupole transition corresponding to
$\pm m \rightleftarrows \pm(m-1)$. Two differential equations are then
obtained from (18), one for $\dot{a}_m$ and another for $\dot{a}_{m-1}$
(substitute $m-1$ for $m$). Terms involving the transitions
$m-1 \rightleftarrows m-2$ and $m \rightleftarrows m+1$ may be dropped because
the applied rf excites only the $m \rightleftarrows m-1$ transitions.
The observed spin angular momenta are given
by the following general definitions of expectation
values:


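$$\bar{I}_z = \sum_m m\, a_m^* a_m; \qquad
\bar{I}_{x,y} = \sum_m (m|I_{x,y}|m\pm 1)\, a_m^* a_{m\pm 1}; \qquad (19)$$

and their derivatives are given by

$$d\bar{I}_z/dt = \sum_m m(\dot{a}_m^* a_m + a_m^* \dot{a}_m);$$
$$d\bar{I}_{x,y}/dt = \sum_m (m|I_{x,y}|m\pm 1)(\dot{a}_m^* a_{m\pm 1} + a_m^* \dot{a}_{m\pm 1}). \qquad (20)$$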


Expressions for $\dot{a}_m$ and $\dot{a}_{m-1}$ from (18) are substituted
into (20). Terms then are collected together in (20)
which are the definitions of $\bar{I}_{x,y}$ in (19). The net macroscopic
magnetic moment contributed by the $m$ states
is defined by $M_{x,y} = \bar{I}_{x,y}\gamma\hbar N$, where $N$ is the total
number of spins in the crystal. The net macroscopic $z$
component of magnetism which is involved in the
induction will be shown to be defined by

$$M_z = \gamma\hbar N(|a_m|^2 - |a_{m-1}|^2)/2,$$

and is not to be confused with the total magnetization
which is proportional to $\bar{I}_z$ in (19). The nuclear induction
equations are then

$$dM_x/dt = \omega_0 M_y - \gamma\zeta M_z H_y,$$
$$dM_y/dt = \gamma\zeta M_z H_x - \omega_0 M_x,$$
$$dM_z/dt = \gamma[M_x H_y - M_y H_x], \qquad (21)$$

where $\zeta = (I+m)(I-m+1)$ and $\omega_0$ is the transition
frequency. These equations are similar to those obtained
by Bloch¹¹ in describing the behavior of spins
in a magnetic field. Following his procedure, a transformation
is made to a frame of reference in which
the $xy$ plane rotates at the frequency of the applied rf
field $H_1$. Therefore the vector $\mathbf{M}(M_x, M_y, M_z)$ in the
laboratory frame transforms to the vector $\mathbf{M}(u, v, M_z)$
in the rotating frame, where $H_1$ is fixed parallel to
the $u$ component. The transformation is given by

$$M_x = u\cos\omega t - v\sin\omega t; \quad M_y = \mp(u\sin\omega t + v\cos\omega t);$$
$$H_x = H_1\cos\omega t; \quad H_y = \mp H_1\sin\omega t. \qquad (22)$$


For positive $e^2qQ$, looking along the positive $z$ direction,
the minus sign is chosen for counterclockwise precession 
of the magnetization for +m, and the plus sign is 


BLOOM, HAHN, AND HERZOG 


chosen for clockwise precession of the magnetization for
$-m$. Both rotating components of the rf field are
therefore utilized in the quadrupole resonance. Applying
(22) to the Eqs. (21), the nuclear induction equations
in the rotating frame are

$$du/dt + (\omega_0 - \omega)v = 0,$$
$$dv/dt - (\omega_0 - \omega)u + (I \pm m)(I \mp m + 1)\omega_1 M_z = 0,$$
$$dM_z/dt - \omega_1 v = 0, \qquad (23)$$
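The behavior implied by Eqs. (23) during a pulse can be illustrated by direct numerical integration. The sketch below is a minimal example, assuming exact resonance (ω = ω0), the +m states with ζ = (I+m)(I−m+1), the initial condition u = v = 0, M_z = M0, and arbitrary units; the function names are illustrative. Under these assumptions the solution is v = −√ζ M0 sin(√ζ ω1 t) and M_z = M0 cos(√ζ ω1 t).

```python
import math

def rhs(state, detuning, zeta, omega1):
    """Eqs. (23) solved for the time derivatives; detuning = w0 - w."""
    u, v, mz = state
    return (-detuning * v,
            detuning * u - zeta * omega1 * mz,
            omega1 * v)

def rk4_evolve(state, detuning, zeta, omega1, t_end, steps=4000):
    """Fourth-order Runge-Kutta integration of (u, v, Mz) from 0 to t_end."""
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(state, detuning, zeta, omega1)
        k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)), detuning, zeta, omega1)
        k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)), detuning, zeta, omega1)
        k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)), detuning, zeta, omega1)
        state = tuple(s + h * (a + 2 * b + 2 * c + d) / 6.0
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

def on_resonance_solution(m0, zeta, omega1, t):
    """Closed-form (u, v, Mz) at resonance, starting from (0, 0, m0)."""
    angle = math.sqrt(zeta) * omega1 * t
    return (0.0,
            -math.sqrt(zeta) * m0 * math.sin(angle),
            m0 * math.cos(angle))
```

For I = 3/2 and m = 3/2, ζ = 3, so the effective nutation frequency is √3 ω1, in agreement with the result of the previous section.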


where the signs are chosen according to the independent
transitions $\pm m \rightleftarrows \pm(m-1)$. The observed induction due
to equal contributions from both signs of $m$ is obtained
from (22) as

$$\mathfrak{M}_x = M_x(m) + M_x(-m) = 2(u\cos\omega t - v\sin\omega t),$$
$$\mathfrak{M}_y = M_y(m) + M_y(-m) = 0,$$
$$\mathfrak{M}_z = M_z(m) + M_z(-m) = 0. \qquad (24)$$


No signal of induction would be detected in a coil
oriented at right angles to the rf coil, which serves in
our experiment to detect $\mathfrak{M}_x$ as well as transmit $2H_1$.
For the case where a single rf pulse is applied, as discussed
in the previous section, the induction signal is
described by

$$\mathfrak{M}_x = 2[(I+|m|)(I-|m|+1)]^{1/2} M_0 \sin\{[(I+|m|)(I-|m|+1)]^{1/2}\omega_1 t_w\}\sin\omega_0(t - t_w) \qquad (25)$$

upon solving (23), where the net magnetization at
thermal equilibrium is

$$M_0 = \gamma\hbar N(|C_{|m|}(0)|^2 - |C_{|m|-1}(0)|^2)/2 \quad \text{for a given } |m|.$$


Fig. 1. Vector model of macroscopic spin precession in an axial
electric field gradient for positive signs of $e^2qQ$ and $\gamma$. The special
case is shown where the macroscopic moment vectors $\mathbf{M}_{+|m|}$ and
$\mathbf{M}_{-|m|}$ are rotated by 90° about the rf field $H_1$ during a single pulse.
This vector model is not to be taken as a rigorous representation
of the motion described by Eqs. (21) during a pulse, since the condition
$M_{|m|} = (u^2 + v^2 + M_z^2)^{1/2}$, in the absence of attenuation, does
not apply here.
Fig. 2. Energy level diagram for spin $I = 3/2$ quadrupole interaction in small magnetic field $H_0$. A system of coordinates
is chosen in which the azimuthal coordinate $\phi_1$ for $H_1$ is zero. The parameters are
$a = [(f-1)/(2f)]^{1/2}$, $b = [(f+1)/(2f)]^{1/2}$, and $f = (1 + 4\tan^2\theta_0)^{1/2}$.


For $(E_{|m|} - E_{|m|-1})/kT \ll 1$, Boltzmann statistics gives
$M_0 = (E_{|m|} - E_{|m|-1})N\gamma\hbar/[2kT(2I+1)]$, where
$E_{|m|} - E_{|m|-1} = 3e^2qQ(2|m|-1)/[4I(2I-1)]$. The net macroscopic
magnetization $M_z = \gamma\hbar N[|C_{|m|}(t)|^2 - |C_{|m|-1}(t)|^2]/2$
is an apparent moment which is the analog of
the $M_z$ normally considered in a purely magnetically
coupled nuclear induction system. After the single pulse
which gave $\mathfrak{M}_x$ in (25) for a given $|m|$, then

$$M_z = M_0\cos\{[(I+|m|)(I-|m|+1)]^{1/2}\omega_1 t_w\}, \qquad (26)$$

which implies that $M_z$ can reverse sign if the cosine
becomes negative. Physically, of course, the actual magnetism
due to superposition of a pair of $|m|$ and $|m|-1$
states can never reverse sign because of the constraint
of the quadrupole coupling acting on the nuclear charge
distribution. A physical picture of the change in the $z$
component of magnetism for a given sign of $m$ is obtained
by evaluating the actual magnetism $\gamma\hbar N\bar{I}_z = M_T$
directly, where $\bar{I}_z$ from (19) is the expectation value.
During the application of a single pulse the $a_{|m|}$, $a_{|m|-1}$
coefficients can be obtained from (18) and substituted
into (19), which gives

$$M_T = \gamma\hbar N\{(|m| - \tfrac{1}{2})[|C_{|m|}(0)|^2 + |C_{|m|-1}(0)|^2] \qquad \text{(27-A)}$$
$$\qquad + \tfrac{1}{2}[|C_{|m|}(0)|^2 - |C_{|m|-1}(0)|^2]\cos\{[(I+|m|)(I-|m|+1)]^{1/2}\omega_1 t_w\}\}. \qquad \text{(27-B)}$$

The term (27-B) corresponds to $M_z$ in (26); let (27-A)
be represented by $M_1$. The $M_z$ change in magnetism
would appear along the $z$ axis in an experiment where a
circularly polarized rf field $H_1$ is applied to excite $m$
states of one sign only. Similarly, $M_y$ could then be
observed. This procedure would be necessary in order
to prevent the cancellation of $M_z$ and $M_y$ by equal and
opposite values produced in our experiment, where two
circularly polarized components of $H_1$ are rotating in
opposite directions. Figure 1 illustrates the motion of
the separate vectors $\mathbf{M}_{+|m|}$ and $\mathbf{M}_{-|m|}$ [however, $M_{|m|}
\neq (u^2 + v^2 + M_z^2)^{1/2}$] corresponding to each sign of $m$,
which are superimposed upon the constant amounts of
magnetism $M_1$.

If a small magnetic field $H_0$ is applied to the spins
along the $z$ axis for example, such that $\gamma\hbar H_0 \ll e^2qQ$,
the entire vector diagram can be thought of as precessing
about $H_0$ in a direction determined by the sign
of $\gamma$. Therefore the symmetry of alignment about the
$x$ axis of $\mathbf{M}_{+|m|}$ and $\mathbf{M}_{-|m|}$ is removed, which in essence
means that the degeneracy of the $\pm m$ states is removed.
A low-frequency modulated induction signal will then
appear along the $y$ axis as well as the $x$ axis due to the
additional precession imposed by $H_0$. The next section
presents an analysis of this case.


B. Small Magnetic Field Case 


First a single rf pulse is applied with conditions imposed
as in case A for $I = 3/2$ in zero field, with the additional
requirement that the conditions $\gamma H_1,\ 1/t_w \gg \gamma H_0$
apply. These conditions can be satisfied approximately
in the experiment for values of $H_0 \lesssim 20$ gauss, and imply
that $H_0$ can be neglected during the application of the
pulse since the spin system scarcely precesses because
of $H_0$ during the time $t_w$. The general Hamiltonian now
is

$$\mathcal{H} = -\mathbf{Q}\cdot\nabla\mathbf{E} - \gamma\hbar\,\mathbf{I}\cdot[\mathbf{H}_0 + \mathbf{H}_1(t)], \qquad (28)$$

where $\mathbf{H}_1(t)$ is chosen as before, and $\mathbf{H}_0$ is the applied
constant field. The directions of $H_0$ and $H_1$ in the
coordinate system having the symmetry axis of $\nabla\mathbf{E}$ as
the $z$ axis are defined by the angles $\theta_0$, $\phi_0$ and $\theta_1$, $\phi_1$
respectively. The matrix of the square bracket may be
written in terms of the eigenfunctions of $I_z$, where the
only important off-diagonal terms to first order are
those connecting the $m = \pm 1/2$ states for $e^2qQ \gg \gamma\hbar H_0$.
Diagonalization of the matrix yields the eigenstates and
eigenvalues shown in Fig. 2.¹³,¹⁴ The effect of the magnetic
field is not only to remove the degeneracy of the
$\pm m$ states, but to mix the $m = \pm 1/2$ states, and yet leave
all other $m$ states pure. Thus all transitions other than
$|m| = 1/2 \rightleftarrows |m| = 3/2$ are similar to the case of $I = 3/2$ with
$H_0$ applied in the $z$ direction.

The solution of the Schrödinger equation in the
absence of $H_1$ is then given by

$$\psi = C_{3/2}(t_1)\phi_{3/2}\exp[-i(\omega_{3/2} + \tfrac{3}{2}\Omega_0\cos\theta_0)(t - t_1)]$$
$$\quad + C_{-3/2}(t_1)\phi_{-3/2}\exp[-i(\omega_{3/2} - \tfrac{3}{2}\Omega_0\cos\theta_0)(t - t_1)]$$
$$\quad + C_{+}(t_1)\phi_{+}\exp[-i(\omega_{1/2} + \tfrac{1}{2}f\Omega_0\cos\theta_0)(t - t_1)]$$
$$\quad + C_{-}(t_1)\phi_{-}\exp[-i(\omega_{1/2} - \tfrac{1}{2}f\Omega_0\cos\theta_0)(t - t_1)], \qquad (29)$$

where $\Omega_0 = \gamma H_0$ and $C_{3/2}(t_1)$, $C_{-3/2}(t_1)$, $C_+(t_1)$, and $C_-(t_1)$
are population coefficients determined from initial conditions
at $t_1 = t_w$ following the rf pulse. The coefficients
$C_{3/2}(t_w)$ and $C_{-3/2}(t_w)$ are obtained directly from Eq. (10)
in Sec. II-A, whereas $C_+(t_w)$ and $C_-(t_w)$ are determined
from relations which equate coefficients $C_{\pm 1/2}(t_w)$ in (10)
to the collected coefficients of $\phi_{\pm 1/2}$ in (29) when $t_1 = t_w$.
The expectation value of $I_x$ is then computed, as
before, to describe the free precession after the pulse,
now using $\psi$ in Eq. (29). A lengthy expression is obtained,
which we will not reproduce here. Low-frequency
terms involving $\Omega_0$ appear which may be
neglected because they produce a negligible induction
effect for $\Omega_0 \ll \omega_0$. The induction voltage developed
across the coil is proportional to $\bar{I}_x$. The observed
macroscopic moment $\gamma\hbar N\bar{I}_x$ for $t > t_w$ is given by


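$$\mathfrak{M}_x = 2\sqrt{3}M_0\sin\theta_1\sin\omega_0 t\,\sin(\sqrt{3}\omega_1 t_w\sin\theta_1)
\left\{\left(\frac{f+1}{2f}\right)\cos[\Omega_0(\cos\theta_0)(3-f)t/2]
+ \left(\frac{f-1}{2f}\right)\cos[\Omega_0(\cos\theta_0)(3+f)t/2]\right\}, \qquad (30)$$

with no decay effects included. A similar expression is
obtained for $\mathfrak{M}_y$. $H_1$ is chosen here in the experiment to
be perpendicular to $H_0$. Now in addition to the fact
that a spread in the field gradient $eq$ causes a decay of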


13 B. T. Feld and W. E. Lamb, Phys. Rev. 67, 28 (1945).
14 C. Dean, thesis, Harvard University, 1952 (unpublished) and
Phys. Rev. 96, 1053 (1954).
$\mathfrak{M}_x$, a spread in $H_0$ due either to external field inhomogeneities
or to local magnetic dipole fields of nuclear
neighbors will likewise cause a static distribution of
induction frequencies in the crystal, given by

$$g(\Omega_0' - \Omega_0) = \exp[-(\Omega_0' - \Omega_0)^2/(2\eta^2)]/(2\pi\eta^2)^{1/2}. \qquad (31)$$

This distribution function, similar to (14), can be
assumed for the distribution in $H_0$, particularly if it is
due to nuclear neighbors, where $\eta$ is the root mean
square deviation of frequency due to internal dipolar
magnetic fields. The average of (30) over both the distribution
functions (14) and (31), where we let
$\omega_0 \to \omega_0 + (\omega_0' - \omega_0)$ and $\Omega_0 \to \Omega_0 + (\Omega_0' - \Omega_0)$ in (30), is


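$$(\mathfrak{M}_x)_{\mathrm{av}} = 2\sqrt{3}M_0\sin\theta_1\sin\omega_0 t\,\sin(\sqrt{3}\omega_1 t_w\sin\theta_1)\exp(-\delta^2t^2/2)$$
$$\times\left\{\left(\frac{f+1}{2f}\right)\cos[\Omega_0(\cos\theta_0)(3-f)t/2]\exp[-\eta^2(\cos^2\theta_0)(3-f)^2t^2/8]\right.$$
$$\left.+\left(\frac{f-1}{2f}\right)\cos[\Omega_0(\cos\theta_0)(3+f)t/2]\exp[-\eta^2(\cos^2\theta_0)(3+f)^2t^2/8]\right\}. \qquad (32)$$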


Of course, the angle $\theta_0$ for the internal field at different
nuclear sites is random in direction, but we choose here
not to average over this angle, which must be done to
describe the actual decay. We assume in (32) that
$\eta$ is the equivalent of an average internal field which
lies along the direction of the externally applied $H_0$ and
is independent of $\theta_0$. It should also be kept in mind that
our discussion here does not include any decay due to
$T_1$ or to effects which involve time dependent fluctuations
of the local dipolar fields, both of which are to be
treated in later papers. The magnetization component
therefore predicts an induction signal modulated by
two different frequencies which are associated with the
splitting of the original quadrupole resonance into lines
symmetrically distributed relative to the original line
(see Fig. 2).
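The two modulation terms can be tabulated directly from the angle θ0. The following minimal sketch assumes the weights (f ± 1)/(2f) and the modulation frequencies Ω0 cos θ0 (3 ∓ f)/2 appearing in Eq. (32), with f = (1 + 4 tan²θ0)^{1/2} as in Fig. 2; decay factors are omitted and the function name is illustrative.

```python
import math

def zeeman_beat_components(theta0, big_omega0):
    """Return [(weight, angular frequency), ...] for the two cosine terms
    of Eq. (32), ignoring the Gaussian decay factors."""
    f = math.sqrt(1.0 + 4.0 * math.tan(theta0) ** 2)
    c = math.cos(theta0)
    return [((f + 1.0) / (2.0 * f), big_omega0 * c * (3.0 - f) / 2.0),
            ((f - 1.0) / (2.0 * f), big_omega0 * c * (3.0 + f) / 2.0)]
```

For θ0 = 0 (f = 1) only the first term survives, with modulation frequency Ω0, which corresponds to the case in which H0 does not mix the m = ±1/2 states.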


C. The Quadrupole Spin Echo in Small Field 


In addition to the first rf pulse applied in the above
case, let a second pulse be turned on at $t = \tau$ and turned off
at $t = \tau + t_w$. The initial spin population coefficients for
the second pulse at $t = \tau$ are obtained from Eq. (29).
The solution of the Schrödinger equation then provides
new population coefficients at $t = \tau + t_w$ which are
matched to the free precession solution during the time
$t > \tau + t_w$. The induction without decay for $t \geq \tau + t_w$ is
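$$\mathfrak{M}_x = -2\sqrt{3}M_0\sin\theta_1\Big\{-\sin(\sqrt{3}\omega_1 t_w\sin\theta_1)\cos^2(\sqrt{3}\omega_1 t_w\sin\theta_1/2)$$
$$\quad\times\left[\left(\frac{f+1}{2f}\right)\cos[\Omega_0(\cos\theta_0)(3-f)t/2] + \left(\frac{f-1}{2f}\right)\cos[\Omega_0(\cos\theta_0)(3+f)t/2]\right]\sin\omega_0 t \qquad \text{(33-I)}$$

$$-\tfrac{1}{2}\sin(2\sqrt{3}\omega_1 t_w\sin\theta_1)\left[\left(\frac{f+1}{2f}\right)\cos[\Omega_0(\cos\theta_0)(3-f)(t-\tau)/2]\right.$$
$$\quad\left.+\left(\frac{f-1}{2f}\right)\cos[\Omega_0(\cos\theta_0)(3+f)(t-\tau)/2]\right]\sin\omega_0(t-\tau) \qquad \text{(33-II)}$$

$$+\sin(\sqrt{3}\omega_1 t_w\sin\theta_1)\sin^2(\sqrt{3}\omega_1 t_w\sin\theta_1/2)\left[\left(\frac{f+1}{2f}\right)^2\cos[\Omega_0(\cos\theta_0)(3-f)(t-2\tau)/2]\right.$$
$$\quad\left.+\left(\frac{f-1}{2f}\right)^2\cos[\Omega_0(\cos\theta_0)(3+f)(t-2\tau)/2]\right]\sin\omega_0(t-2\tau) \qquad \text{(33-III)}$$

$$+\sin(\sqrt{3}\omega_1 t_w\sin\theta_1)\sin^2(\sqrt{3}\omega_1 t_w\sin\theta_1/2)\left(\frac{f^2-1}{2f^2}\right)\Big[\cos[3\Omega_0(\cos\theta_0)t/2]\cos[f\Omega_0(\cos\theta_0)(t-2\tau)/2]$$
$$\quad+\cos[3\Omega_0(\cos\theta_0)(t-2\tau)/2]\cos[f\Omega_0(\cos\theta_0)t/2] - \cos[3\Omega_0(\cos\theta_0)t/2]\cos[f\Omega_0(\cos\theta_0)t/2]\Big]\sin\omega_0(t-2\tau)\Big\}. \qquad \text{(33-IV)}$$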


A similar expression is obtained for $\mathfrak{M}_y$. In order to
interpret the various terms as they contribute to the
observed signals, it is essential that each term in (33)
be averaged over the assumed static Gaussian distributions
in $eq$ and $\Omega_0$ given by (14) and (31) respectively.
Equation (32) indicates the result of such an
averaging after the first pulse. Each of the terms in (33)
becomes multiplied by a Gaussian decay factor
$\exp(-a\eta^2t_1^2/2 - \delta^2t_2^2/2)$ where $a$ depends upon factors
in the cosine arguments involving $\Omega_0$, and $t_1$ is the time
factor which accompanies them. The time $t_2$ pertains to
the particular time factor in the sine function which is
averaged over $\omega_0' - \omega_0$. Term I, which is due to a
remnant of the free precession following the first pulse,
is usually not observed after the second pulse and in
fact is avoided by making $\tau$ sufficiently large. Term II
is the free precession tail which follows after the second
pulse if the average magnetization $M_z(t_w)$ happens to
be nonzero after application of the first pulse. $M_z(\tau)$
can also be nonzero due to thermal relaxation of nuclei
excited by the first pulse. The latter effect is not included
here since we have assumed $T_1$ to be infinite, and
hence $M_z(t_w) = M_z(\tau)$. An extra term omitted here describes
a signal that can be utilized to measure $T_1$ (see
reference 6). The measurement of $T_1$ in some quadrupole
systems will be described by one of us (M.B.) in a
later paper. Term III is the “normal” spin echo at
$t = 2\tau$ which has a maximum amplitude independent of
the static spread in precessional frequencies throughout
the crystal. Term IV is an additional contribution to the
spin echo signal discussed previously by Bloom,¹⁵ which
has a periodic amplitude as a function of $\tau$, and so gives
rise to a slow echo envelope beat with a period $\sim 2\pi/\Omega_0$.


15 M. Bloom, Phys. Rev. 94, 1396 (1954).





The contribution from this term, however, attenuates
rapidly in a time of the order of $2\pi/\eta$. The slow beats
here provide an interesting analog to the echo envelope
modulation slow beats previously observed in nuclear
magnetic resonance¹⁶ for resonances exhibiting multiplet
structure due to the indirect nuclear spin-spin
coupling. In that interaction between identical but
chemically nonequivalent nuclei, mixtures of singlet
and triplet spin states bring about an interference between
spin states which is manifested by the slow echo
beats. For quadrupole echoes the mixing of the $m = \pm 1/2$
states due to the Zeeman effect gives rise to the slow
beats for the same reason. Note that the slow-beat Term
IV vanishes for $\theta_0 = 0$ or $\pi$ ($f = 1$), corresponding to the
condition that $H_0$ does not mix the $m = \pm 1/2$ states so that
they form two independent systems. This is analogous
to the fact that there are no slow beats in the case of
identical nuclei in the earlier magnetic resonance experiments,
where two groups of like nuclei have different
chemical shifts but have a zero indirect spin-spin interaction.


III. CHLORINE FREE INDUCTION SIGNALS
FROM NaClO₃


The quadrupole coupling of chlorine in NaClO₃ is a
standard case which has been studied by several observers.¹⁷
The electric field gradient $eq$ is assumed to be
axially symmetric about the molecular bond ($z$ axis)
joining Na to Cl. Figure 3 shows the NaClO₃ unit cell,
which has four chlorine nuclei with their axes of quantization
oriented along the body diagonals of the subcubic
cells. There are two nonequivalent directions of
Na—Cl axes with respect to $H_0$, and a pair of chlorine
nuclei assigned to each of these directions is denoted by
+ and −. The shapes of free precession signals due to
the Cl³⁵ nucleus serve to confirm the expressions obtained
in Eqs. (32) and (33-III). Specific cases are
shown in Fig. 4, where an external $H_0$ field is perpendicular
to the cubic axis of symmetry and the rf field
$2H_1$ is perpendicular to $H_0$.


16 McNeil, Slichter, and Gutowsky, Phys. Rev. 84, 1245 (1951).

17 Wang, Townes, Schawlow, and Holden, Phys. Rev. 86, 809
(1952); R. Livingston, Science 118, 61 (1953); J. Itoh and R.
Kusaka, J. Phys. Soc. Japan 9, 434 (1954); Ting, Manring, and
Williams, Phys. Rev. 96, 408 (1954).

Fig. 3. Crystal structure of NaClO₃. The cubic lattice unit cell
dimension is 6.57 Å on a side.

The lifetime of free precession following a pulse is
determined primarily by magnetic dipolar broadening
due to nearest sodium neighbors, which provide a root
mean square local field $\Delta_0 = 0.80$ gauss at chlorine sites.
In order to determine $\eta$ in (31), which is the effective
mean square local field, the effect of $\Delta_0$ on each of the
quadrupole $m$ states must be evaluated, which requires
a knowledge of the net parallel and perpendicular components
of $\Delta_0$ with respect to the Na—Cl axis. Such a
calculation yields $T_2^* = 0.48$ millisecond, assuming that
$T_2^{*2} = 1/\eta^2$. The observed $T_2^*$ is shorter (~0.3 millisecond),
because of possible temperature gradients,
strains in the crystal, Zeeman broadening due to the
earth’s field (particularly in powdered samples), and
because of the additional but smaller broadening due to
coupling among chlorine nuclei. The thermal relaxation
time $T_1$ scarcely affects the observed relaxation here,
since $T_1 \sim 23$ milliseconds at room temperature. The
observed signal following the first pulse is given by
$V = V_+ + V_-$, where


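$$V_\pm = 2\sqrt{3}M_0(\cos\theta_\pm)\sin(\sqrt{3}\omega_1 t_w\cos\theta_\pm)
\left\{\left(\frac{\beta_\pm - 1}{2\beta_\pm}\right)\cos[\Omega_0(\cos\theta_\pm)(3+\beta_\pm)t/2]\right.$$
$$\left.+\left(\frac{\beta_\pm + 1}{2\beta_\pm}\right)\cos[\Omega_0(\cos\theta_\pm)(3-\beta_\pm)t/2]\right\}\exp(-t^2/2T_2^{*2}), \qquad (34)$$

and the echo signal¹⁸ (for $t > \tau$) is described by

$$V_\pm = -2\sqrt{3}M_0(\cos\theta_\pm)\sin(\sqrt{3}\omega_1 t_w\cos\theta_\pm)\sin^2[\sqrt{3}\omega_1 t_w(\cos\theta_\pm)/2]$$
$$\times\left\{\left(\frac{\beta_\pm - 1}{2\beta_\pm}\right)^2\cos[\Omega_0(\cos\theta_\pm)(3+\beta_\pm)(t-2\tau)/2]
+\left(\frac{\beta_\pm + 1}{2\beta_\pm}\right)^2\cos[\Omega_0(\cos\theta_\pm)(3-\beta_\pm)(t-2\tau)/2]\right\}$$
$$\times\exp-[(t-2\tau)^2/2T_2^{*2} + t^2/T_2'^2(H_0,\varphi)]. \qquad (35)$$


18 In reference 2, the square superscript was omitted in error
for the coefficients involving $\beta_\pm$.


In terms of the angle $\varphi$ which $H_0$ makes with respect to
the 100 direction in NaClO₃, $\cos\theta_\pm = (2/3)^{1/2}\cos(\varphi \pm \tfrac{1}{4}\pi)$
and $\sin\theta_\pm = [1 - \tfrac{2}{3}\cos^2(\varphi \pm \tfrac{1}{4}\pi)]^{1/2}$, where $\theta$ is the angle
$H_0$ makes with the Na—Cl bond. Also

$$\beta_\pm = (1 + 4\tan^2\theta_\pm)^{1/2}.$$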
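The geometric factors for a given field direction can be evaluated numerically. The sketch below is a minimal example, assuming cos θ± = (2/3)^{1/2} cos(φ ± π/4), β± = (1 + 4 tan²θ±)^{1/2}, and modulation frequencies Ω0 cos θ± (3 ∓ β±)/2 as in Eq. (34); the product β cos θ is computed as (cos²θ + 4 sin²θ)^{1/2} so that it stays finite at θ = 90°. The function names are illustrative.

```python
import math

def site_geometry(phi):
    """Return {'+': (cos_theta, beta_times_cos_theta), '-': (...)} for H0 at
    angle phi (radians) from the 100 direction."""
    sites = {}
    for label, sign in (("+", 1.0), ("-", -1.0)):
        cos_t = math.sqrt(2.0 / 3.0) * math.cos(phi + sign * math.pi / 4.0)
        sin_sq = 1.0 - cos_t * cos_t
        # beta*cos(theta) = sqrt(cos^2 + 4 sin^2) remains finite at 90 degrees
        sites[label] = (cos_t, math.sqrt(cos_t * cos_t + 4.0 * sin_sq))
    return sites

def modulation_frequencies(phi):
    """Magnitudes of the modulation frequencies, in units of Omega_0,
    for both sites, sorted in increasing order."""
    freqs = []
    for cos_t, beta_cos in site_geometry(phi).values():
        freqs.append(abs(3.0 * cos_t - beta_cos) / 2.0)
        freqs.append(abs(3.0 * cos_t + beta_cos) / 2.0)
    return sorted(freqs)
```

For φ = 45° this gives frequencies of roughly 0.52, 1, 1, and 1.93 in units of Ω0, close to the values quoted for the lower trace of Fig. 4.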
Fig. 4. Oscillographic display of free precession signals due to
Cl³⁵ in a single NaClO₃ crystal in the presence of a small magnetic
field $H_0$. The upper trace shows free precession following a first
pulse, and an echo at $t = 2\tau$ after a second pulse applied at $t = \tau$,
where $\sqrt{3}\omega_1 t_w = \pi/2$ for both pulses. A magnetic field of $H_0 = 11.4$
gauss is applied along the crystalline 1,0,0 direction ($\varphi = 0$).
The signal shapes are described by Eqs. (34) and (35). The sweep
length is 2.5 milliseconds. The lower trace illustrates the free
precession following a single pulse for $H_0 = 25$ gauss along the
crystalline $1/\sqrt{2}$, $1/\sqrt{2}$, 0 direction ($\varphi = 45°$). From Eq. (34) the
signal shape, excluding attenuation, is given by

$$V(t) \propto 0.4\cos(1.9\Omega_0 t) + 1.6\cos(0.53\Omega_0 t) + 2\cos(\Omega_0 t).$$

The sweep length is one millisecond.
Fig. 5. Echo envelope trace of slow beat modulation due to Cl³⁵
in a single NaClO₃ crystal. $H_0 = 11.8$ gauss in the crystalline
1,0,0 direction ($\varphi = 0$). The sweep length is 2.0 milliseconds.


The static contribution to the relaxation time is in
terms of $T_2^{*2} = 1/(\eta^2 + \delta^2)$, and the dynamic contribution
to the relaxation which excludes the effect of
$T_1$, but includes the effect of time varying local
fields and spin-spin coupling, is lumped together in
terms of $T_2'$. For each setting of $\tau$, with the spin
ensemble at thermal equilibrium, the maximum of
the echo amplitude is observed to be proportional to
$\exp[-(2\tau)^2/T_2'^2(H_0,\varphi)]$. The time constant $T_2'$ is
observed to be a function of $\varphi$ and $H_0$. The principal
cause for this effect in NaClO₃ will be shown in some
detail in a later paper by two of us (E.L.H. and B.H.)
to be the fact that the spin-spin coupling among the Na
nuclei is modified considerably by the Zeeman splitting.
A similar but weaker effect occurs due to coupling
among Cl nuclei. The correlation time of local fields
at Cl sites, due to the spin-spin coupling among Na
dipoles, is therefore modified and a change in $T_2'$ is
observed.

Figure 5 illustrates the slow beats in NaClO₃. An
artificial distribution of $eq$ is imposed by introducing a
temperature gradient across the sample. Otherwise the
free induction signal (32) following
the first pulse would interfere with echoes exhibiting
the echo slow beat, which only lasts for a time $2\tau \sim 2\pi/\eta$.
This is also approximately the natural lifetime of the
free induction tail. The lifetime of the echo is relatively
unaffected by a gradient in $eq$. Hence the echo remains
visible whereas the induction signal is now sufficiently
attenuated due to the decay factor involving $\delta$ ($\delta \gg \eta$) and
does not interfere with the echo. Now terms (33-III)
and (33-IV) are applied to the case of NaClO₃, and contributions
from the two nonequivalent positions of Cl
in the unit cell must be added. For the case where $H_0$ is
oriented along the 1,0,0 axis, the echo envelope slow
beat, including only the decay due to dipolar broadening,
is predicted to be

$$V(2\tau) \propto (5/9) + (8/9)\cos(\sqrt{3}\Omega_0\tau)\exp(-3\eta^2\tau^2/2)
- (2/9)[1 + \cos(2\sqrt{3}\Omega_0\tau)\exp(-6\eta^2\tau^2)],$$

which is confirmed by Fig. 5. Other orientations of $H_0$
give a more complicated pattern. The slow beats are
also observed in a crystalline powder, where the beat
structure also persists for a time $\sim 2\pi/\eta$, even though
the electric field gradients are oriented in random
directions with respect to $H_0$.
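The undamped echo maximum can be evaluated for an arbitrary orientation from the coefficients of terms (33-III) and (33-IV). The following is a minimal sketch under our reading of those terms at t = 2τ, namely b⁴ + a⁴ + 2a²b²[cos 3x + cos fx − cos 3x cos fx] with x = Ω0 cos θ0 τ, a² = (f−1)/(2f), and b² = (f+1)/(2f); this grouping is an assumption made for illustration, decay factors are omitted, and the function name is hypothetical.

```python
import math

def echo_envelope(theta0, omega0_tau):
    """Undamped echo maximum at t = 2*tau, normalized to unity at tau = 0.
    omega0_tau is the product Omega_0 * tau (radians)."""
    f = math.sqrt(1.0 + 4.0 * math.tan(theta0) ** 2)
    a_sq = (f - 1.0) / (2.0 * f)
    b_sq = (f + 1.0) / (2.0 * f)
    x = omega0_tau * math.cos(theta0)
    normal_echo = a_sq ** 2 + b_sq ** 2  # contribution of term (33-III)
    slow_beat = 2.0 * a_sq * b_sq * (     # contribution of term (33-IV)
        math.cos(3.0 * x) + math.cos(f * x)
        - math.cos(3.0 * x) * math.cos(f * x))
    return normal_echo + slow_beat
```

For H0 along the 1,0,0 axis, cos θ0 = 1/√3 and f = 3, so the constant contribution of term (33-III) is 5/9 and the beat terms oscillate at √3 Ω0 and 2√3 Ω0.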


IV. APPARATUS 


The apparatus for production and detection of quad- 
rupole free precession signals is similar to that used 
Fig. 6. Block diagram of apparatus for obtaining nuclear quadrupole
free precession.

Fig. 7. 30 Mc/sec pulsed oscillator for obtaining free nuclear quadrupole precession.


previously in observing spin echoes in nuclear magnetic 
resonance. When zero-field quadrupole signals are to 
be observed, however, it is essential that the receiving 
coil be coaxial with the transmitting coil—the trans- 
mitting coil may actually serve as the receiving coil—for 
reasons discussed previously. A block diagram, Fig. 6, 
shows the apparatus used. The transmitter (circuit 
diagram in Fig. 7) consists of a pulsed oscillator normally
held at cutoff and brought into oscillation by negative
gating pulses. The gating unit triggers the oscilloscope
sweep and, after suitable delay, provides several negative
gates of equal but variable width and variable time
separations.

In free precession experiments the two receiver 
problems are: (1) adequately short recovery time of 
receiver sensitivity following intense transmitter pulses 
to allow observation of induction signals of short life- 
time; (2) sufficient signal to noise ratio for reception 
of weak nuclear signals. Experience has shown that a 
tuned rf receiver consisting of a low noise preamplifier 
followed by a broadband radar i.f. strip with crystal 
detector suffices for experiments carried out in the 
Fig. 8. Preamplifier and converter for detection of free nuclear quadrupole precession.







Fig. 9. Oscillographic display of
spin echo signals at 150 Mc/sec from
Br⁸¹ in powdered NaBrO₃ due to
the application of three rf pulses at
$t = 0$, $t = \tau$, and $t = T$. The sweep
length is 2 milliseconds, $\tau = 0.27$
millisecond, and $T = 0.71$ millisecond.
A magnetic field of $H_0 = 10$
gauss is applied to enhance the
echo signals (see reference 2).


region of 30 Mc/sec, and has adequate recovery time.
However, the problems of instability and of tuning, as
well as the difficulty of adapting the tuned rf system to
higher resonant frequencies, such as the 150 Mc/sec
resonance of Br⁸¹ in NaBrO₃, have led to the use of a
modified television superheterodyne tuner.

Figure 8 shows such a converter modified for quad- 
rupole resonance detection in the 30 Mc/sec region. The 
initial 6 Mc/sec band pass of the television tuner was 
reduced by removing the damping resistor across the 
input coil. The input, oscillator, and mixer coils 
(mounted on a plastic bar “channel tuning strip”) were 
trimmed to the desired frequency with condensers. To 
reduce recovery time of the mixer input, a “rf test 
point” (indicated on the commercial tuner) below the 
10K resistor in the grid circuit was grounded. For best 
signal to noise ratio adjustment, separate B+ supplies 
for local oscillator and mixer were provided. The mixer 
output coil was placed directly in the 17 Mc/sec i.f. 
strip to simplify tuning. The receiver has a sensitivity 
of ~1.0 microvolts for a signal to noise ratio of unity, 
and a recovery time of ~20 microseconds when intense 
rf pulses are applied to the input. 


V. CONCLUDING REMARKS 


An initial study has been presented of transient 
nuclear induction in systems of quadrupole moments in 
which nuclear polarization and precession is principally 
determined by nuclear quadrupole coupling to an axial 
electric field gradient. Various effects pertaining to 
nuclear relaxation times and crystal structure studies 


IN NUCLEAR RESONANCE 


by means of Zeeman structure analysis remain to be
investigated in many crystals. The pulsed spin echo
method lends itself particularly to a clear-cut means of
separating the various contributions to spin relaxation:
(a) thermal spin relaxation times $T_1$; (b) static spread
in the line width in terms of a transient induction decay;
and (c) dynamic fluctuations of local fields in the lattice 
causing echo envelope decay. A study of these various 
contributions as they are influenced by changes in 
parameters such as temperature and pressure and, for 
example, the relaxation behavior under the influence of 
acoustic vibrations should yield interesting results. The 
method of double nuclear resonance can be applied to 
nuclear quadrupole free induction studies to reveal new 
information regarding nuclear spin coupling¹⁹ and
relaxation. Cases in free precession remain to be studied
in which the electric field gradient is asymmetric. The
effect of the asymmetry upon the beats due to the 
Zeeman effect should be of interest. 

Two of us (E.L.H. and B.H.) wish to thank Mr. 
Blume, Mr. Jones, and Mr. Kiselewsky for their ex- 
cellent work and assistance in the construction of the 
apparatus. We thank Dr. T. Wang for discussions and 
Dr. A. Schawlow of Bell Laboratories for provision of 
an NaClO; single crystal. Miss Alice Waara helped in 
preparing the drawings. One of us (M.B.) wishes to 
thank Professor C. P. Slichter and Professor R. E. 
Norberg for many valuable discussions. 


19 B. Herzog and E. L. Hahn, Bull. Am. Phys. Soc. 29, No. 7, 11 
(1954); E. L. Hahn and B. Herzog, Bull. Am. Phys. Soc. 29, 
No. 8, 17 (1954). 
Simultaneity in the Compton Effect* 


Z. Bay, V. P. HENRI, AND F. McLERNON 
The George Washington University, Washington, D. C. 


(Received October 22, 1954) 


The simultaneity in the Compton effect has been investigated by using scintillation counters and fast 
coincidence techniques. By the use of a gamma-ray source of a known, very short, lifetime (Ni⁶⁰) it was 
possible to apply a first moment investigation to the time delays involved. 

It is shown that the order of magnitude of all possible time delays involved in the Compton effect does 
not exceed $10^{-11}$ second. Upper limits previously given were ~$10^{-8}$ second. 





IN 1950, Hofstadter and McIntyre, and Cross and 
Ramsey¹ repeated the Bothe-Geiger experiment² 
(1925) under improved conditions of time resolution 
($1.5\times10^{-8}$ second). Experiments reported in this paper 
lower the limit of such time measurements by three 
orders of magnitude. 

The ability to obtain this shorter time limit is due to 
two facts: (1) the time resolution of our technique is 
about one order of magnitude better; (2) the use of a 
gamma-ray source of a known, very short, mean life 
(Ni⁶⁰) permits one to apply a first moment investigation 
which is much more accurate in coincidence technique. 
This gives an additional improvement of two orders of 
magnitude. 

It has been shown previously³ that, in the cascade of 
Co⁶⁰→Ni⁶⁰, the β particle and the Compton recoil 
electron released by the gamma radiations (γ) of Ni⁶⁰ 
appear within a time interval of less than $10^{-11}$ second. 
This result was used to provide an upper limit for the 
decay times involved. More precisely, it demonstrates 
that $\theta+\langle t_e\rangle<10^{-11}$ second, where $\theta$ is the mean life 
of the excited state of Ni⁶⁰ (the fact that Ni⁶⁰ has two 
excited states and therefore emits two gammas is not 
important here), and $\langle t_e\rangle$ is the average of the possible 
random time delays $t_e$ in the release of a Compton 
electron, e, by γ. Thus $\langle t_e\rangle$ must be less than $10^{-11}$ 
second, i.e., the emission of a Compton electron is 
“simultaneous with the incident gamma” within $10^{-11}$ 
second. Since $t_e$ is always positive (the recoil cannot 
appear earlier than the β), the probability of obtaining 
a $t_e$ greater than $n\langle t_e\rangle$ is certainly not greater than $1/n$. 

The upper limit given here for $\langle t_e\rangle$ is also valid for a 
constant time delay of the same amount in every 
Compton process, as discussed by Hoffman, Shenstone, 
and Turner.⁴ They point out that such a constant 
time delay could not be detected in a coincidence 
experiment involving only Compton electrons released 
by a primary and its scattered gamma and that, from 


* This work was supported by the joint program of the U. S. 
Office of Naval Research and the U. S. Atomic Energy 
Commission. 

1 R. Hofstadter and J. A. McIntyre, Phys. Rev. 78, 24 (1950); 
W. G. Cross and N. F. Ramsey, Phys. Rev. 80, 929 (1950). See 
also these papers for a more complete list of references. 

2 W. Bothe and H. Geiger, Z. Physik 32, 639 (1925). 
3 Bay, Henri, and McLernon, Phys. Rev. 97, 561 (1955). 
4 Hoffman, Shenstone, and Turner, Phys. Rev. 50, 1092 (1936). 


previous experiments of Piccard and Stahel,⁵ one can 
deduce an upper limit of 10~’ second for this constant 
time delay. 

In a separate experiment, to be described below, we 
measured coincidences between two Compton electrons, 
e and e′, released by the primary and the scattered 
gamma, γ and γ′. 

If there is a random time delay $t$ in the release of the 
γ′ wave train and $\langle\tau\rangle$ is the mean time length of this 
wave train, then the average time of appearance of e′, 
related to the time of appearance of the β, after subtraction 
of times of flight, is $\theta+\langle t\rangle+\langle\tau\rangle+\langle t_e\rangle$. Here we make the 
reasonable assumption that the average of the possible 
random time delay in the release of e′ by γ′ is also 
$\langle t_e\rangle$. 

Coincidence measurements between e and e′ gave for 
the average of the time delays, $t'$, between the appearance 
of these two electrons $\langle t'\rangle=\langle t\rangle+\langle\tau\rangle<1.5\times10^{-11}$ 
second. Thus both $\langle t\rangle$ and $\langle\tau\rangle$ are less than 
$1.5\times10^{-11}$ second. Since both $t$ and $\tau$ are positive, the 
probability of getting a value for either of them greater 
than $n$ times the average is less than $1/n$. 
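
The 1/n bound used above is the standard Markov inequality for a non-negative quantity. A minimal numerical sketch follows; the exponential delay distribution, the sample size, and the function name `tail_fraction` are illustrative assumptions and are not taken from the experiment.

```python
import random


def tail_fraction(samples, n):
    """Fraction of non-negative samples exceeding n times the sample mean."""
    mean = sum(samples) / len(samples)
    return sum(1 for s in samples if s > n * mean) / len(samples)


# Hypothetical delay distribution: exponential with mean 1e-11 second.
random.seed(0)
mean_delay = 1.0e-11
delays = [random.expovariate(1.0 / mean_delay) for _ in range(100_000)]

for n in (2, 5, 10):
    # For any non-negative sample the fraction can never exceed 1/n.
    print(n, tail_fraction(delays, n), "bound:", 1.0 / n)
```

Any other non-negative distribution could be substituted for the exponential one; the bound does not depend on the shape of the distribution.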

The experimental arrangement is shown in Fig. 1. 
Two small diphenyl acetylene crystals A and B (8×8 


Fig. 1. Experimental arrangement, showing the diphenyl acetylene crystals. 


5 A. Piccard and E. Stahel, J. phys. et radium 7, 326 (1936). 


Fig. 2. Integral pulse-height distributions (taken in coincidence with the outputs of the C circuit) in detector A: (1) scattered γ's; (2) primary γ's with light absorber. 




×20 mm³) are separated by an Al absorber (800 
mg/cm²), each facing a 1P21 photomultiplier. The 
whole system is mounted as a movable assembly. 
Gamma rays from a Co⁶⁰ source (~20 millicuries) are 
collimated by a Pb channel (6-mm diameter, 22-cm 
length). The position of the detector assembly can be 
changed in such a way that the γ beam impinges on 
either crystal A or B. The Compton-scattered gammas 
(~90°) resulting from primary scintillation events in 
one crystal are detected in the other crystal. Preliminary 
experiments showed that the coincidences detected are 
overwhelmingly true Compton coincidences and that 
<1 percent of them are due to chance coincidences and 
to the [γγ] coincidences of Ni⁶⁰. The coincidence 
counting rate obtained ($N_0\sim3$ per second) was in 
Fig. 3. Differential delay coincidence curves. 
agreement with the value calculated for the experimental 
arrangement. This corroborates once more the 
result of the original Bothe-Geiger experiment showing 
that the electron and the scattered photon are emitted 
in the same elementary process. With independent 
probabilities for the emission of e and γ′, $N_0$ would 
have been smaller by more than two orders of 
magnitude. 

Delayed coincidence curves have been obtained (1) 
when crystal A was in the primary beam and B was 
excited by scattered gammas, and (2) in the reversed 
position. The time differences (including times of 
flight of γ′ and the possible time delays $t'$) change sign 
when passing from (1) to (2). 

In order to obtain the highest possible number of 
coincidences, we used light absorbers for adjusting 
amplitude distributions as previously described.³ Figure 
2 shows the respective pulse-height distributions 
(taken in coincidence with the output of the C circuit) 
in detector A when excited by scattered gammas 
(average energy ~360 kev), and when excited by 
primary gammas with a proper light absorber placed 
between crystal and photomultiplier. The rms difference 
of the corresponding ordinates is 2.6 percent. A light 
absorber of the same transparency as used in detector 
A was necessary to equalize the pulse-height distributions 
for the two radiations in detector B. To utilize 
higher counting rates in the coincidence measurements 
we use the $D_1$ output⁶ to gate the output of the C 
circuit (counting rate $N_0$), obtaining thereby $R_1(T)$, 
and then plot $\nu(T)=R_1(T)/N_0$ versus $T$. The linear 
portion of such a curve can be used to measure very 
short time delays.* When one changes from a “prompt” 
source to the source producing delayed events [probability 
density function $p(t')$ for the delay $t'$], the change 
$\Delta\nu$ of the ordinate at a chosen $T$ is 


\[ \Delta\nu=\mu_1(p)\,\frac{d\nu}{dT}, \tag{1} \] 


where $\mu_1(p)$ is the average time delay [first normalized 
moment of $p(t')$]. In our case (reversing the excitations 
in the channels) no prompt source is needed and $\Delta\nu$ is 
twice as large. Equation (1) thus gives $2\mu_1(p)$, or 


6 Bay, Meijer, and Papp, Phys. Rev. 82, 754 (1951). 
twice the average time delay. In Fig. 3 the entire 
delayed coincidence curve is shown for the case (1) 
when crystal A is in the beam, and only part of the 
delayed coincidence curve for the reversed case (2). 
We measured $\Delta\nu$ at $T=0$, recording ~1300 coincidences 
for each case and repeating the measurements so that 
all together ~13 000 coincidences were involved. The 
slope dv/dT was measured separately between T 
=—2.5X10-" second and T=+2.5X10-" second. 
The average time delay was found to be $\mu_1(p)=(5\pm1.5)\times10^{-11}$ 
second, where the error is the standard deviation 
of the total set. The average time of flight calculated 
from the geometry was $4.5\times10^{-11}$ second. After 
subtracting this from $\mu_1$, the remaining part is within 
the standard deviation. Thus the average time delay 
between the emission of the two Compton electrons e 
and e′ is $\langle t'\rangle=\langle t\rangle+\langle\tau\rangle<1.5\times10^{-11}$ second. 
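
The arithmetic of the first-moment method can be sketched as follows. The slope and ordinate shift used here are hypothetical values chosen only so that the result reproduces the quoted average delay of 5×10⁻¹¹ second; the function names are illustrative, and the factor of two for reversed channels follows the statement in the text.

```python
def mean_delay_from_shift(delta_nu, slope, reversed_channels=True):
    """Average delay mu_1 from the ordinate shift, following Eq. (1).

    With the excitations reversed between the channels the shift is
    2 * mu_1 * slope, so the divisor carries a factor of two.
    """
    factor = 2.0 if reversed_channels else 1.0
    return delta_nu / (factor * slope)


def residual_delay(mu1, time_of_flight):
    """Delay left over after removing the calculated time of flight."""
    return mu1 - time_of_flight


# Hypothetical inputs (not measured values from the paper).
slope = 1.0e9      # d(nu)/dT in 1/second, assumed
delta_nu = 0.1     # ordinate shift between the two arrangements, assumed

mu1 = mean_delay_from_shift(delta_nu, slope)   # about 5e-11 second
residual = residual_delay(mu1, 4.5e-11)        # about 0.5e-11 second
sigma = 1.5e-11                                # quoted standard deviation

print(mu1, residual, abs(residual) <= sigma)
```

The residual lies inside one standard deviation, which is the reasoning behind quoting only an upper limit for the delay.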

As a summary, we may say that the emission of an 
electron and a scattered gamma in the Compton 
process is simultaneous with the incident gamma 
within a time of ~$10^{-11}$ second. In addition, the mean 
time length of the outgoing γ′ wave train is also less 
than $10^{-11}$ second. Quantum theoretical time uncertainties 
for the release of electrons from atomic 
bonds and the time length of the accompanying 
scattered-gamma wave train are smaller than 10-” 
second and are thus far below the limits of present 
techniques. 
Connection Between Dirac’s Electron and a Classical Spinning Particle 


F. GÜRSEY 
Department of Mathematics, University of Istanbul, Istanbul, Turkey 


(Received September 22, 1954) 


A connection is established between the classical and wave mechanics of a spinning relativistic particle. 
In the case of a plane wave comprising both positive and negative energies, the 4-dimensional stream-lines 
of the current density 4-vector coincide with the world-lines of a classical particle. Moreover, the wave 
function represents the spinor (or quaternion) which determines the generalized intrinsic frame of reference 
attached to the world-line of the classical particle. This provides a geometrical interpretation for the un- 
quantized Dirac field as well as for the total angular momentum. 


ATTEMPTS have been made to establish a connection 
between Dirac’s electron in wave mechanics 
and a classical relativistic spinning particle. 
Among others, Weyssenhoff¹ has sought properties of 
the spinning particle which were similar to the quantum-mechanical 
properties of the electron, without, however, 
explaining these similarities. Huang² has shown that a 
wave packet in Dirac’s theory moves in the first approximation 
like a spinning particle in Weyssenhoff’s 


1 J. Weyssenhoff, Acta Phys. Polonica 9, 1, 46 (1947). 
2 K. Huang, Am. J. Phys. 20, 479 (1952). 


classical theory. More recently, de Broglie³ finds a 
relation between Dirac’s electron and a Weyssenhoff 
particle by applying the W. K. B. approximation to 
the wave equation. 

The purpose of this note is to report some results 
connected with this problem which were obtained 
from a general geometrical theory of world-lines. In 
the case of a plane-wave solution of Dirac’s equation 
involving states of both positive and negative energy 


3 L. de Broglie, La théorie des particules de spin 1/2 (Gauthier-
Villars, Paris, 1952). 







in the absence of field, it is shown that the de Broglie-
Bohm⁴ pilot wave method of associating a fictitious 
classical particle with a given wave function leads to 
the following simple conclusions: 

(1) The 4-dimensional streamlines which satisfy the 
equation 

\[ \frac{dx_\mu}{ds}=\bar\psi\gamma_\mu\psi \] 

coincide exactly with the world-lines of relativistic 
classical spinning particles (in the sense of Weyssenhoff’s 
theory) which possess a spin of magnitude $\hbar/2$. 

(2) The wave function $\psi$ is susceptible of a simple 
geometrical interpretation: it defines the Lorentz 
transformation which transforms a fixed frame of 
reference into the generalized Frenet frame of reference 
(proper frame) associated with each point of the 
stream-line. (The Frenet formulas in space-time were 
first introduced by Synge⁵ in connection with relativistic 
hydrodynamics.) 

(3) The generalized Darboux vector is an antisymmetrical 
tensor $\Omega_{\mu\nu}$ in space-time. At each point 
it represents the generalized angular velocity of the 
proper frame along the stream-line. 

Being proportional to the total angular momentum 
of the wave or of its associated classical particle, it 
has a simple physical meaning and is related to the wave 
function through the relation: 


dy /ds=} (yy — 7’) Qu, 


where d/ds denotes differentiation along the stream-line. 

(4) For the plane wave under consideration the total 
angular momentum is constant. It follows that the 
corresponding classical particles have constant angular 
momentum and the Darboux tensor is constant along 
their world lines. This means that the latter are special 
helices in space-time. In fact they are helices for which 
the first and second curvatures are constant and the 
third curvature vanishes. The trinormal at each point 


4 D. Bohm, Phys. Rev. 85, 166 (1952). 
5 J. L. Synge, Proc. London Math. Soc. 43, 376 (1937). 
is constant and proportional to the spin density pseudovector 
of the wave. 

To get a physical picture of such 4-dimensional 
curves we consider the motion of a spinless charged 
particle in a constant electromagnetic field. The corresponding 
world-lines, which have been studied by 
Taub,⁶ also turn out to be 4-dimensional helices. If 
$E^2-H^2<0$ and $\mathbf{E}\cdot\mathbf{H}=0$, then we obtain curves which 
are identical with Dirac’s stream-lines or the world-lines 
of spinning free particles. 

(5) The four basis vectors of the proper frame 
associated with a streamline are simply related to the 
wave function. The tangent and the trinormal are 
respectively proportional to the current-density vector 
$\bar\psi\gamma_\mu\psi$ and to the spin-density pseudovector $\bar\psi\gamma_5\gamma_\mu\psi$ 
($\gamma_5=\gamma_1\gamma_2\gamma_3\gamma_4$). The principal normal (acceleration 
vector of the classical particle) and the binormal are 
respectively equal to the real and imaginary parts of 
the vector $\bar\psi\gamma_\mu\psi^{c}$, where $\psi^{c}=i\gamma_2\psi^{*}$ denotes the charge-conjugate 
wave function. We have thus a geometrical 
interpretation of four mutually orthogonal 4-vectors 
which arise in Dirac’s theory. 


In short, a new connection between relativistic wave 
mechanics and classical mechanics in the field-free case 
is obtained through a geometrical approach, using 
concepts of differential geometry and associating a 
spinor (or quaternion) at each point of the world-line 
with the proper frame at that point. The most con- 
venient tool for the study of the geometry of world 
lines seems to be the quaternion formalism by means of 
which 4-dimensional rotations assume a simple form. 
The corresponding tensor and spinor equations can 
be derived by application of rules developed by the 
author.⁷ 

A detailed account of this work, which was submitted 
to the University of Istanbul in 1953 as a habilitation 
thesis, will be published later. 


6 A. H. Taub, Revs. Modern Phys. 21, 388 (1949). 
7 F. Gürsey, Ph.D. thesis, London, 1950 (unpublished). 
Compacted Scintillators 


L. REIFFEL AND H. V. Watts 
Armour Research Foundation of Illinois Institute of Technology, 
Chicago, Illinois 
(Received January 14, 1955) 


AN investigation has been made of the properties of 
scintillators prepared by high-pressure compaction 
in an evacuated die using techniques similar to 
those employed in infrared analysis.¹ A number of 
representative phosphor systems have been examined to 
date including NaI(Tl), KBr(Tl), anthracene, and 
ZnS(Ag). In all cases, activated microcrystalline powders 
were subjected to compaction pressures of the order of 
100 000 psi in a small double-piston die from which the 
air had been withdrawn by a mechanical vacuum pump. 
Pressures were held for periods from 5 to 15 minutes; 
the die was then opened and the material removed. Precautions 
were taken with hygroscopic substances such as 
NaI(Tl) to avoid water vapor during the entire procedure. 

In successful compactions reasonably clear cylindrical 
masses are obtained with diameters of ~1.3 cm and 
thicknesses of ~0.4 cm. Such results have been obtained 
thus far for both NaI(Tl) and KBr(Tl) although some 
light scattering within the compacted volume is still 
evident. It seems likely that the residual scattering will 
be greatly reduced with better dies operating at higher 
effective pressures and with more care in the preparation 
of the microcrystalline powders. Results have not been 
as promising with anthracene which, presumably as a 
result of chemical reaction under pressure, compacts to 
an almost opaque, highly discolored substance. ZnS (Ag) 
compacts in separated, somewhat translucent layers at 
the piston surfaces with the majority of the mass re- 
maining powdery. Again more efficient dies operating at 
higher effective pressures than our present die may 
produce better results. 

For compacted NaI(Tl) and KBr(Tl), the scintillation 
efficiency was determined by direct pulse-height 
comparison with single-crystal samples² of similar size 
using a DuMont K 1186 PM tube and Cs¹³⁷ gamma rays. 
For KBr(Tl) we consistently obtain ninety percent of 
the single crystal pulse height while for NaI(Tl) we have 
obtained values ranging from 35 percent to 85 percent. 
The spread for NaI(Tl) is due primarily to the inadequacy 
of our techniques for handling the microcrystalline 
powder. No detailed measurements have 
been made on scintillation decay time, but any differ- 
ence between the single crystals and the compacted 
material is certainly slight. 

The comparatively high efficiency of properly pre- 
pared compacted scintillators indicates that the plastic 
flow and lattice destruction accompanying the process 
do not interfere seriously with the gross effectiveness 
of energy transfer from the host crystal to the activator 
sites. Furthermore, it seems likely that afterglow phe- 
nomena associated with deep traps will be less pro- 
nounced with unannealed compacted scintillators than 
with single crystals. No attempts at annealing or other 
treatments to increase the scintillation pulse height have 
been made as yet. 

It appears possible that compaction techniques may 
ultimately provide effective means for producing large 
sensitive volumes while retaining most of the advantages 
of present inorganic scintillators. It is also possible that 
greater control of activator distribution will be obtainable, 
making practical such highly concentration-sensitive 
systems as LiI(Tl).³ 

1 M. M. Stimson, J. Chem. Soc. 74, 1805 (1952); U. Schiedt 
and H. Reinwein, Z. Naturforsch. 7B, 270 (1952). 
2 Scintillation grade material obtained from the Harshaw 
Chemical Company. 
3 S. C. Curran, Luminescence and the Scintillation Counter 
(Academic Press, Inc., New York, 1953), p. 133. 


Infrared Absorption of Germanium near 
the Lattice Edge 


G. G. MACFARLANE AND V. ROBERTS 


Radar Research Establishment, Malvern, England 
(Received December 10, 1954) 


WE have noticed that the absorption spectrum of 
a number of semiconductors, PbS, PbSe, PbTe, 
Mg₂Sn, and Ge obeys a law of the form $K\propto(h\nu-E_0)^{n}$, 
where $n$ is either 2 or 3, at low values of the absorption 
constant $K$ near the long wave edge of the main lattice 
absorption band. It appeared to us that such laws could 
be understood if the tail of the lattice edge absorption 
arose from optically “forbidden” transitions and that 
the energy $E_0$ would then be closely related to the 
minimum energy gap. 

In a recent letter Hall, Bardeen, and Blatt¹ have 
calculated the optical absorption by valence electrons 
for “indirect” and “direct” transitions. For the 
“indirect” transitions they give a law of the form 
K « (hv—E){k2+h(m.-+m,)(hv—E)}, which is in 
agreement with our observations. In view of the quali- 
tative success of the theory it seemed to us important to 
obtain sufficient data to test it quantitatively, and for 
this purpose we have chosen to study germanium. As 
published data on absorption in germanium² are inadequate 
for our purpose, we have measured the absorption 
for a 60 ohm cm polycrystalline specimen using glass 
prisms to get the highest resolution. The method de- 
pends essentially on measuring the radiation intensity 
with and without the specimen present, deducing the 
reflection coefficient, R, from the constant value of the 
transmission coefficient at wavelengths well beyond the 
absorption edge, and using the measured thickness of 
the germanium plate to calculate the absorption con- 
stant. Due allowance was made for multiple reflections.” 
R was found to vary slightly with temperature as given 
in Table I. It should be noted that the value of R at 


TABLE I. Dependence of reflection coefficient on temperature. 
room temperature is higher than the one obtained by 
Briggs,*® viz., 0.365. Absorption curves were taken at 
seven temperatures ranging from 4.2°K to 291°K. 
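We have found that $K$ can be well represented at low 
levels of absorption by a law of the form 

\[ K=A\left[\frac{1}{1-e^{-\theta/T}}\left(\frac{h\nu-E_G-k\theta}{h\nu}\right)^{2}+\frac{1}{e^{\theta/T}-1}\left(\frac{h\nu-E_G+k\theta}{h\nu}\right)^{2}\right]. \tag{1} \] 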

Thus when $K^{1/2}$ is plotted against $h\nu$, as in Fig. 1, the 
points lie close to one straight line for $E_G-k\theta<h\nu<E_G+k\theta$ 
and to a steeper straight line for $h\nu>E_G+k\theta$. This 
behavior is just what one expects if the absorption in 
this range is due to indirect transitions in which there is 
a marked change in momentum between the initial and 
the final states and in which direct transitions do not 
play a role. The first term in brackets then refers to 
Fig. 1. Dependence of absorption constant on photon energy for 
60 ohm-cm germanium. 
photon absorption with emission of a phonon of energy 
$k\theta$, the second to photon absorption with absorption of a 
phonon of energy $k\theta$, and $E_G$ is the minimum energy 
gap. We find that if $\theta=260$°K the dependence of $K$ on $h\nu$ 
and on $T$ follows closely the prediction of (1). The full 
lines in Fig. 1, calculated from (1), show how good the 
agreement is. The energy gap, $E_G$, is plotted against 
temperature in Fig. 2. 
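
A minimal sketch of how the two-branch law (1) can be evaluated is given below. It assumes that each term contributes only above its own threshold (phonon absorption above $E_G-k\theta$, phonon emission above $E_G+k\theta$), takes the bracketed ratios as squared, and uses the fitted value $A=1150$ cm⁻¹ and $\theta=260$°K quoted in the letter; the function name and the Boltzmann constant value are illustrative choices.

```python
import math

K_BOLTZ_EV = 8.617e-5  # Boltzmann constant in eV per degree K


def absorption_constant(hv, e_gap, temp, theta=260.0, a=1150.0):
    """Absorption constant K (cm^-1) from the two-branch law (1).

    hv and e_gap are in eV, temp and theta in degrees K.
    Each branch is switched on only above its threshold (assumption).
    """
    k_theta = K_BOLTZ_EV * theta
    x = theta / temp
    total = 0.0
    emission = hv - e_gap - k_theta      # photon absorbed, phonon emitted
    if emission > 0:
        total += (emission / hv) ** 2 / (1.0 - math.exp(-x))
    absorption = hv - e_gap + k_theta    # photon absorbed, phonon absorbed
    if absorption > 0:
        total += (absorption / hv) ** 2 / (math.exp(x) - 1.0)
    return a * total


# Example: room-temperature gap of 0.655 eV at 291 K.
for hv in (0.60, 0.66, 0.70):
    print(hv, absorption_constant(hv, 0.655, 291.0))
```

Plotting the square root of the returned value against `hv` should show the two nearly straight segments described in the text, with the steeper one beginning at $E_G+k\theta$.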

From cyclotron resonance experiments⁴ we know that 
in germanium the energy maximum of the valence band 
is at the origin, and equal energy minima occur along the 
eight [111] axes of momentum in the conduction band. 
Therefore, in the above interpretation $k\theta$ is the energy 
of the longitudinal acoustic wave with momentum in the 
[111] direction equal to the momentum $k_c$ of the electrons 
at the minimum of the conduction band. From the 
known values of the elastic constants⁵ and the theory of 
vibrations of the diamond lattice,⁶ we estimate that the 
conduction band minima occur at a momentum of 
magnitude $6.2\times10^{7}$ cm$^{-1}$, which is about 2/3 of the 
momentum at the edge of the zone in the [111] directions. 

We have attempted to correlate the intrinsic carrier 
concentration $n_i$ deduced by Morin and Maita⁷ from 
drift mobility and conductivity, with the values of $E_G$ of 
Fig. 2. We find 

\[ n_i=4.82\times10^{15}\,T^{3/2}\,(m/m_0)^{3/2}\,N_c^{1/2}\exp(-E_G/2kT), \tag{2} \] 

where $m=0.25\,m_0$ is calculated from the effective masses 
of holes and electrons⁴ including two valence bands and 
$N_c=8$ is the number of minima in the conduction band. 
This gives, with $E_G=0.655$ ev, $n_i=1.82\times10^{13}$ cm$^{-3}$ at 
291°K, whereas Morin and Maita find $n_i=1.4\times10^{13}$ 
cm$^{-3}$. 
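
The quoted carrier concentrations can be checked numerically. The sketch below assumes the exponents 3/2 on $T$ and on $m/m_0$ and 1/2 on $N_c$ (the usual form of the intrinsic-concentration formula, which reproduces the quoted numbers to within rounding); the function name and the Boltzmann constant value are illustrative.

```python
import math

K_BOLTZ_EV = 8.617e-5  # Boltzmann constant in eV per degree K


def intrinsic_concentration(temp, e_gap, mass_ratio, n_minima):
    """Intrinsic carrier concentration n_i in cm^-3, following Eq. (2).

    temp in degrees K, e_gap in eV, mass_ratio = m/m0,
    n_minima = number of conduction-band minima N_c.
    """
    prefactor = 4.82e15 * temp ** 1.5 * mass_ratio ** 1.5 * math.sqrt(n_minima)
    return prefactor * math.exp(-e_gap / (2.0 * K_BOLTZ_EV * temp))


n_025 = intrinsic_concentration(291.0, 0.655, 0.25, 8)  # close to 1.8e13
n_021 = intrinsic_concentration(291.0, 0.655, 0.21, 8)  # close to 1.4e13
print(f"{n_025:.3e}  {n_021:.3e}")
```

Lowering the mass ratio from 0.25 to 0.21 scales the result by $(0.21/0.25)^{3/2}\approx0.77$, which is the adjustment that brings the formula into line with the Morin and Maita value at 291°K.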

We have also estimated the magnitude of $K$ theoretically. 
The factor $A$ in (1) depends on the effective 
masses of holes and electrons, on $k_c$, $\theta$, $N_c$, the refractive 
index, and the coupling constant $C$. This latter can be 
expressed in terms of the electron and hole mobilities. 
We find $A_{\text{theor}}=1420$ cm$^{-1}$, whereas the value of $A$ 
which best fits the data is $A_{\text{obs}}=1150$ cm$^{-1}$. Although 
this close agreement may be fortuitous, we consider that 
it does indicate that the magnitude predicted theoretically 
is of the right order. 

Finally, using (2) with $m=0.21\,m_0$, 
which brings the values of $n_i$ of (2) and of Morin and 
Maita into line at 291°K, and values of $n_i$ given by 
Morin and Maita at temperatures above 291°K, we have 
deduced $E_G$ at higher temperatures. The results are 
shown dotted in Fig. 2, from which it will be seen that 
the two branches fit together with very little discontinuity 
of slope. We would remark on the quadratic 
behavior of $E_G$ at low temperature. 

Acknowledgment is made to the Chief Scientist, 
British Ministry of Supply, for permission to publish 
this letter. 
Ultrasonic Attenuation in Metals by 
Electron Relaxation 


R. W. Morse* 
Royal Society Mond Laboratory, Cambridge University, 
Cambridge, England 


(Received December 9, 1954; revised manuscript received 
January 13, 1955) 


A DIFFERENCE in the ultrasonic attenuation between 
lead in the normal and superconducting 
states has been reported recently by Bömmel.¹ In this 
note it will be shown that the magnitude and temperature 
dependence of the attenuation in the normal state 
can be explained reasonably in terms of an incomplete 
adjustment of the Fermi distribution with respect to the 
elastic deformation, and consequently such an attenuation 
is to be expected in all metals at low temperatures 
when the mean free path becomes relatively long.² 

In the free electron gas model of a metal, the Fermi 
surface is a sphere. With this gas we can associate an 
internal kinetic pressure given by $p_F=(2/5)nE_0$, where 
$n$ is the number of electrons per unit volume and $E_0$ is 
the Fermi energy.³ If a longitudinal compressive strain 
$e_x$ (in the x-direction) is produced slowly, the Fermi 
surface remains spherical and $p_F$ increases uniformly 
because the volume decreases. On the other hand, if $e_x$ 
is brought about quickly enough, only the electron 
velocity components in the x-direction react immediately 
and the Fermi surface is elongated momentarily in 
that direction. Collisions of electrons with the lattice 
eventually lead to the equilibrium spherical distribution, 
and the stress necessary to maintain $e_x$ relaxes to its 
equilibrium value. The relaxation time $\tau$ of this process 
is the same as the one commonly used in the theory of 
electrical conductivity. The magnitude of the relaxing 
part of the stress $\Delta p_x$ can be found simply by noting 
that the x-component of velocity is increased by a 
factor $(1+e_x)$ in the instantaneous application and by 
the factor $(1+\tfrac{1}{3}e_x)$ in the equilibrium case. In both 
situations $n$ increases by $(1+e_x)$, and so $\Delta p_x$ is found to 
be $(8/15)nE_0e_x=be_x$. A more detailed analysis also 
gives this result. 
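
The factor 8/15 follows from first-order bookkeeping, which can be sketched with exact fractions. The sketch assumes the x-component of the kinetic stress scales as $n v_x^2$, so that to first order in $e_x$ the density contributes one power of $(1+e_x)$ and the velocity factor contributes twice its own first-order coefficient; the function name is illustrative.

```python
from fractions import Fraction


def relaxing_stress_coefficient():
    """Coefficient of n*E0*e_x in the relaxing part of the stress.

    The stress scales as n * v_x**2. To first order in the strain e_x:
      instantaneous case: n -> (1+e), v_x -> (1+e)    => 1 + 3e
      equilibrium case:   n -> (1+e), v_x -> (1+e/3)  => 1 + (5/3)e
    The difference, times the pressure (2/5) n E0, is the relaxing part.
    """
    pressure_over_n_e0 = Fraction(2, 5)
    instantaneous = 1 + 2 * Fraction(1, 1)
    equilibrium = 1 + 2 * Fraction(1, 3)
    return pressure_over_n_e0 * (instantaneous - equilibrium)


print(relaxing_stress_coefficient())
```

The exact-fraction arithmetic is used only to make the bookkeeping transparent; it is not a substitute for the more detailed analysis mentioned in the text.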

Such an effect may be expressed in terms of a relaxational 
elastic constant 

\[ k=k_0\,(1+b/k_0)\Big/\left[1+\frac{b}{k_0(1+i\omega\tau)}\right]. \] 

The attenuation constant $\alpha$ is the imaginary part of 
$-\omega(\rho_0/k)^{1/2}$, $\rho_0$ being the density. When $\omega\tau\ll1$ we obtain, 
after expressing the relaxation time in terms of $\sigma$, the 
electrical conductivity, and neglecting $b$ with respect 
to $k_0$, 

\[ \alpha=\frac{4}{15}\,\frac{\omega^{2}mE_0\sigma}{\rho_0c_0^{3}e^{2}}. \tag{1} \] 

Here, $c_0$ is the longitudinal wave velocity, $e$ is the electron 
charge, and $m$ its mass. 
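
The low-frequency limit can be checked numerically against the complex elastic constant. The sketch below assumes the relaxational form $k=k_0(1+b/k_0)/[1+b/k_0(1+i\omega\tau)]$, which reduces to $k_0$ at low frequency and to $k_0+b$ at high frequency, and compares the exact imaginary part with $\omega^2 b\tau/(2\rho_0c_0^3)$, the form that becomes Eq. (1) once $b=(8/15)nE_0$ and $\tau=\sigma m/ne^2$ are inserted. The density, the ratio $b/k_0$, the frequency, and the relaxation time are hypothetical illustration values.

```python
import cmath
import math


def elastic_constant(omega, tau, b, k0):
    """Relaxational elastic constant (assumed form, see lead-in)."""
    return k0 * (1 + b / k0) / (1 + b / (k0 * (1 + 1j * omega * tau)))


def attenuation_exact(omega, tau, b, k0, rho0):
    """Imaginary part of -omega * sqrt(rho0 / k)."""
    k = elastic_constant(omega, tau, b, k0)
    return (-omega * cmath.sqrt(rho0 / k)).imag


def attenuation_low_frequency(omega, tau, b, k0, rho0):
    """Limit for omega*tau << 1 with b neglected against k0."""
    c0 = math.sqrt(k0 / rho0)
    return omega ** 2 * b * tau / (2.0 * rho0 * c0 ** 3)


# Hypothetical cgs values for illustration only.
rho0 = 11.3                 # g/cm^3, roughly the density of lead (assumed)
c0 = 2.4e5                  # cm/sec, wave velocity used in the text
k0 = rho0 * c0 ** 2         # unrelaxed-limit modulus, dyn/cm^2
b = 1.0e-3 * k0             # assumed small relaxing part
omega = 2 * math.pi * 1e7   # assumed angular frequency, rad/sec
tau = 1.0e-11               # assumed relaxation time, sec

exact = attenuation_exact(omega, tau, b, k0, rho0)
approx = attenuation_low_frequency(omega, tau, b, k0, rho0)
print(exact, approx, exact / approx)
```

For these values the two results agree to about one part in a thousand, the residual being of order $b/k_0$, the quantity neglected in the limit.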

Van den Berg⁴ observed that $\sigma$ of lead in the normal 
state is given approximately by 1/¢=p’+6.6X10-* T® 
ohm-cm, where p’ is the residual resistance. Assuming 
that curve R of Fig. 1 (which also shows Bömmel’s 
Fig. 1. Attenuation vs temperature in lead. The solid curves 
show Bömmel’s measurements; curve R is assumed due to some 
other mechanism, and the crosses are calculated from Eq. (1). 


results) is an attenuation due to some other cause, 
Eq. (1) can be used to evaluate p’ by fitting (a,—R) at 
the lowest temperature. Using Ey>=4 ev and co=2.4 
X 10° cm/sec we obtain p’ = 1.0 10-* ohm-cm (a plausi- 
ble magnitude). Attenuation at higher temperatures 
may be calculated using Van den Berg’s results for $\sigma$, and 
the resulting points are shown in Fig. 1. 

The fair agreement between Eq. (1) and Bömmel’s 
results, in spite of the simple model used, lends support 
to the original assumption of a relaxation of the Fermi 
distribution. Other consequences of this mechanism are: 
(a) The effect should be observed in all metals at low 
enough $T$, provided $\rho'$ is small. (b) An absorption of 
comparable magnitude will occur for shear waves (as 
found by Bömmel) since a shear strain can be considered 
equivalent to simultaneous compressive and extensive 
longitudinal strains of the type discussed above. (c) 
Equation (1) provides no reasonable explanation of the 
rapid drop of the attenuation in the superconducting 
region, as long as only normal electrons are considered. 
This suggests that even a small number of superconducting 
electrons has a large effect in speeding the 
equilibrium between normal electrons and the lattice. 
The writer thanks Dr. A. B. Pippard for valuable 
comments. He also expresses his appreciation to the 
Howard Foundation for making possible his stay at 
Cambridge. 


*On leave from Brown University, Providence, Rhode Island. 

1 H. Bömmel, Phys. Rev. 96, 220 (1954). 

2 Since preparing this note it has been brought to the author’s 
attention that W. P. Mason has discussed Bömmel’s results in 
terms of the equivalent shear viscosity of the electron gas [Phys. 
Rev. 97, 557 (1955) ]; this discussion leads to results similar to those 
derived here. The approach used in the present paper clearly 
involves the same physical mechanism as does shear viscosity, and, 
at low frequencies, is an alternative way of considering the same 
effect. The two differ, however, at high frequencies in the same 
way that a relaxation type absorption differs from a viscous 
absorption. It is felt that the present approach gives a somewhat 
clearer picture of the mechanism involved. 

3 J. E. Mayer and M. G. Mayer, Statistical Mechanics (John 
Wiley and Sons, Inc., New York, 1940), p. 386. 

4 G. J. Van den Berg, Physica 14, 111 (1948). 


Superfluidity in Unsaturated Helium Films 
above the λ Temperature 


EARL LONG AND LOTHAR MEYER 
Institute for the Study of Metals, The University of Chicago, 
Chicago, Illinois 
(Received January 31, 1955) 


DIRECT flow observations¹ as well as heat conductivity 
measurements²,³ have shown that superfluidity 
appears in unsaturated helium films at a given 
temperature only above a critical value of the saturation 
$P/P_0$, $P$ being the gas pressure in equilibrium with the 
film, $P_0$ the vapor pressure of the bulk liquid. By means 
of the adsorption isotherm the values of $P/P_0$ can be 
transformed into film thicknesses, eventually expressed 
in numbers of statistical layers. 

An analysis of detailed heat conduction experiments 
on unsaturated films, to be published soon, showed that 
at 1.3°K about 2 layers are not showing superfluidity, 
this number increasing to 10 layers at 2.0°K. An 
extrapolation to the λ-temperature led to the result that 
about 20 layers should be immobile at this temperature. 
Since the film thickness can exceed 20 layers appreciably⁴ 
and thermodynamic considerations suggested the 
possibility that superfluidity can occur in unsaturated 
films above $T_\lambda$,³,⁵ we investigated whether or 
not films of more than 20 layers would exhibit superfluidity 
at or above $T_\lambda$. 

The heat conduction apparatus of reference 2 was 
used. The bath temperature was kept at 14$+0.2 
millidegrees below the λ-point using P_λ = 38.1 mm Hg,⁶,⁷
corresponding to 2.183°K of the conventional scale, and
dP/dT = 0.094 mm Hg/10⁻³ deg.

Without contribution of superfluidity the conductance of the
apparatus is ~10 μwatts/deg, and its capacity ~1×10⁻² joule/deg,
so that 0.3-μwatt heating produces a warming rate at the warm end
of ~2 millidegrees/minute, as established at P/P₀ ~ 0.8,
corresponding to about 7 layers. Using then (a) P/P₀ = 0.9996
(P_bath − P_system = 0.3 mm oil, P_bath ≈ 560 mm oil), estimated
film thickness ~30 layers, and (b) about 15 percent more gas in
the apparatus than necessary to make P_system = P_bath (estimated
film thickness ~50 layers), we obtained the following result:

Up to 2 μwatts heat input did not produce any measurable
temperature difference up to temperatures of 2.185° of the
conventional scale, i.e., 2 millidegrees above the real λ-point.
Higher heating rates up to 7.3 μwatts did produce increasing
temperature differences. However, up to 2.195°K of the
conventional scale the warming rate with these heat inputs was
only a fraction of that derived from the experiments at
P/P₀ ~ 0.8 (without contribution of superfluidity), suggesting
that superfluid heat transport is active up to these temperatures.

The results seem therefore to indicate that films of more than
20 layers are showing superfluid behavior at and above the
λ-point of the bulk liquid. A detailed account will be given in
the forthcoming paper.
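
The apparatus figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only a rough consistency check: it assumes the warming rate equals heating power divided by heat capacity (conduction losses neglected), and all variable names are illustrative rather than the authors'.

```python
# Rough consistency check of the apparatus figures quoted in the text.
# Assumption (not from the text): warming rate = heating power / heat capacity,
# i.e. conduction losses are neglected during the initial warm-up.

heating_power_w = 0.3e-6                 # 0.3 microwatt heat input
warming_rate_deg_per_s = 2.0e-3 / 60.0   # about 2 millidegrees per minute

implied_capacity_j_per_deg = heating_power_w / warming_rate_deg_per_s

conductance_w_per_deg = 10.0e-6          # about 10 microwatts per degree
time_constant_s = implied_capacity_j_per_deg / conductance_w_per_deg

# Saturation implied by the oil-manometer readings of case (a).
p_bath_mm_oil = 560.0
pressure_deficit_mm_oil = 0.3
saturation = 1.0 - pressure_deficit_mm_oil / p_bath_mm_oil

print(f"implied heat capacity: {implied_capacity_j_per_deg:.1e} joule/deg")
print(f"thermal time constant: {time_constant_s:.0f} s")
print(f"saturation P/P0:       {saturation:.4f}")
```

Under these assumptions the implied capacity is of order 10⁻² joule/deg, and the manometer readings give a saturation within about one part in 10⁴ of the quoted 0.9996.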

1 E. Long and L. Meyer, Phys. Rev. 85, 1030 (1952).

2 E. Long and L. Meyer, Phys. Rev. 87, 153 (1952).

3 E. Long and L. Meyer, Phil. Mag. Suppl. 2, 18 (1953).

4 R. Bowers, Phil. Mag. 44, 1309 (1953).

5 L. Meyer and E. Long, Phys. Rev. 85, 1035 (1952).

6 E. Long and L. Meyer, Phys. Rev. 83, 860 (1951).

7 R. A. Erickson and L. D. Roberts, Phys. Rev. 93, 957 (1954).


Electronic Density of States of Graphite* 


JOHN E. HOVE
Nuclear Engineering and Manufacturing, North American Aviation, 
Inc., Downey, California 
(Received January 10, 1955) 


IT has been shown by Carter and Krumhansl¹ that the electronic
density of states of graphite near the top of the filled band is
asymmetric about the energy where this band touches the next,
unfilled, band. This conclusion is based on a modification of the
Wallace² band structure. This modification consists of recognizing
the difference in the number of neighbors between the four atoms
in the unit cell and thus assigning two of Wallace's diagonal
matrix elements a different value than the other two. The
discontinuity of the density of states at the band edge is
proportional to the difference (H₁₁ − H₂₂) and if this is large
enough (greater than 2γ₁) the bands need not touch at all. It
should be mentioned that asymmetry in the energy contours is a
necessary feature to explain the sizable negative Hall coefficient.³

The purpose of the present note is to point out that 
the effect of including next-nearest neighbors (in the 
basal plane) is to introduce an asymmetry in the density 
of states in the same qualitative fashion as above. Next- 
nearest neighbors in the plane are easily taken into 
consideration and, in fact, Wallace² has already done so,
although he neglects such terms when calculating the 
density of states. From reference 2, the energy of a state 
k, including nearest and next-nearest neighbors in the 
plane and nearest neighbors out of the plane, is: 


$$\epsilon = -\gamma_1 \cos\tfrac{1}{2}ck_z \pm \left[\gamma_1^2 \cos^2\tfrac{1}{2}ck_z + \tfrac{3}{4}\gamma_0^2 a^2 \kappa_{xy}^2\right]^{1/2} - \tfrac{3}{4}\gamma_0' a^2 \kappa_{xy}^2. \qquad (1)$$

Here κ_xy² = κ_x² + κ_y², where κ = k − k(corner), ε is
measured from the band edge, and γ₀, γ₀′, and γ₁ are the
resonance integrals involving coplanar nearest and next-nearest
and interplanar nearest neighbors, respectively. The only effect
of including γ₀′ is a term in κ_xy². In the two-dimensional
approximation (γ₁ = 0) the γ₀′ term does not affect the density
of states to a first order. However, when γ₁ is retained, the
terms in γ₀′ and in γ₀ are of the same order for small κ_xy, and
the density of states curve is altered markedly. The calculation
for this case can be readily performed with the assumption that
γ₀′ ≪ γ₀, which is certainly true. The result for the density of
states per atom to a second order in the energy is given below,
where n₀ = γ₀′/γ₀ and n₁ = γ₁/γ₀.


(1) For 0 < ε < 2γ₁,


€ 


T 
[1+ Snot (et 12n0m) 
2 211 


Niece 
AC3 = 
e 


+(£) atom} (2) 


271 


(2) For −2γ₁ < ε < 0,
N(.)= 


271 T | e| 
1——nom+—(4—12nom) 
rye 2 271 


+(<-) d-6rmma] (3) 


The effect of γ₀′ is seen in Fig. 1. There is a discontinuity
in N(ε) at zero and both its value and slope are less for the
lower band. This is the same qualitative feature found by Carter
and Krumhansl. The next-nearest neighbor effect may be of equal
or greater importance than the effect of coordination number,
although estimates of relative magnitude are difficult to make.
It is, however, significant that these two modifications
(dissimilar in nature) have similar effects on the density of
states curve. This is probably a general property of the graphite
lattice symmetry in that if the correct matrix elements were
expanded in a Fourier series the above asymmetry would generally
appear. The coefficients in the series would be disposable
constants and not directly identifiable with any particular
overlap integral by itself. The effects considered above and by
Carter and Krumhansl would, however, be a major part of the first
order coefficients. This change in the density of states is in
the correct direction to explain the negative Hall coefficient.

FIG. 1. The graphite density of states (the solid line is for γ₀′ = 0).
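
The asymmetry introduced by the next-nearest-neighbor term can be illustrated numerically from the dispersion of Eq. (1). The sketch below is hedged in several ways: it assumes the numerical coefficients in Eq. (1) are 3/4 (as in the usual expansion near the zone corner), the parameter values are arbitrary illustrative numbers rather than values for graphite, and `band_energy` is a hypothetical helper name.

```python
import math

def band_energy(kappa_a, c_kz, gamma0, gamma0_prime, gamma1, branch):
    """Energy from Eq. (1), measured from the band edge.

    kappa_a is a * kappa_xy, c_kz is c * k_z, branch is +1 or -1.
    The 3/4 coefficients are an assumption of this sketch.
    """
    g = gamma1 * math.cos(0.5 * c_kz)
    root = math.sqrt(g * g + 0.75 * (gamma0 * kappa_a) ** 2)
    return -g + branch * root - 0.75 * gamma0_prime * kappa_a ** 2

# Illustrative values in units of gamma0 (assumed, not taken from the text).
GAMMA0, GAMMA0_PRIME, GAMMA1 = 1.0, 0.05, 0.1
KAPPA_A = 1.0e-3  # small displacement from the zone corner

# Upper band just above the edge (c*k_z = 0, + branch) and
# lower band just below it (c*k_z = 2*pi, - branch).
upper = band_energy(KAPPA_A, 0.0, GAMMA0, GAMMA0_PRIME, GAMMA1, +1)
lower = band_energy(KAPPA_A, 2.0 * math.pi, GAMMA0, GAMMA0_PRIME, GAMMA1, -1)

n0n1 = (GAMMA0_PRIME / GAMMA0) * (GAMMA1 / GAMMA0)
ratio = abs(lower) / upper
expected = (1.0 + 2.0 * n0n1) / (1.0 - 2.0 * n0n1)

print(f"|lower| / upper curvature ratio: {ratio:.6f}")
print(f"small-kappa estimate:            {expected:.6f}")
```

Expanding the square root for small κ_xy gives quadratic terms proportional to (1 − 2n₀n₁) for the upper band and (1 + 2n₀n₁) for the lower band, so the two bands are no longer mirror images once γ₀′ is nonzero; the code checks this ratio numerically.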

The author would like to acknowledge discussions of 
this topic with J. A. Krumhansl. 

* This note is based on studies conducted for the U. S. Atomic 
Energy Commission. 
1 J. L. Carter and J. A. Krumhansl, J. Chem. Phys. 21, 2238 (1953).
2 P. R. Wallace, Phys. Rev. 71, 622 (1947).
3 G. Hennig, J. Chem. Phys. 20, 1438 (1952).


Superconducting Transitions in Tin 
Whiskers* 


O. S. Lutes, National Bureau of Standards, Washington, D. C., and 
Department of Physics, University of Maryland, 
College Park, Maryland 


AND 


E. MAXWELL,† National Bureau of Standards, Washington, D. C.
(Received January 19, 1955) 


MEASUREMENTS have been undertaken to determine the form of
the superconducting resistance transition in tin whiskers.¹
These small filaments are of interest in the study of size
effects in superconductors because they are an order of magnitude
smaller than those used in previous such studies.² Our
preliminary results differ markedly from those for larger wires.
Transitions were obtained at several temperatures using a
transverse magnetic field; i.e., the restoration of resistance at
a given temperature was studied by increasing a magnetic field
applied perpendicularly to the specimen axis. The whisker, of
diameter 1.2×10⁻⁴ cm,
was mounted on a Pyrex plate using silver paste con- 
tacts. The effective sample length was approximately 
50X10 cm. The temperature was controlled by a liquid 
helium bath and the resistance measured with a Mueller 
bridge. The sample current was less than 5 μa.

Results at three different temperatures are shown in Fig. 1.
The data are plotted on a reduced basis, H_c being the critical
field for a bulk specimen of natural tin³ in a longitudinal
field. The unusual behavior of the transition is most pronounced
at 1.69°K, where an abrupt (less than 0.02 percent of H_c in
width) transition occurs at H = 0.67H_c. At this temperature
there is no apparent evidence of the intermediate state. As the
temperature is increased towards the transition temperature of
bulk tin (3.73°K), the intermediate state appears and occupies an
increasingly wide range of field. It should be remarked that a
residual resistance was measured in the superconducting region.
This was attributed to the contacts and was subtracted from the
total.

FIG. 1. Plot of whisker transition showing fraction of normal
resistance restored as a function of the reduced field variable.
o: 3.65°K; *: 3.60°K; O: 1.69°K.

For purposes of comparison the results obtained by Andrew² for
his largest and smallest wires at 1.66° are shown in Fig. 2,
together with our result at 1.69°. The intermediate state is
observed for all of Andrew's wires. The value of H/H_c at which
intermediate resistance first appears is designated as p. For
Andrew's larger wires p is nearly ½, in keeping with the
demagnetizing coefficient of a transverse cylinder. As the wire
diameter is reduced p increases, becoming 0.64 for a diameter of
27×10⁻⁴ cm.

Approximate theories for the intermediate state in transverse
wires have been developed by Andrew⁴ and Kuper,⁵ using concepts
originally stated by Landau.⁶ The intermediate state, when it
exists, consists of alternating superconducting and normal
domains, so shaped that the overall free energy is lower than for
either the normal or the superconducting state. The delayed onset
of the intermediate state, in the thinner wires, is attributed to
the positive surface energy of the interphase boundaries. As the
wire diameter is reduced, the domain segmentation, hence the
relative contribution of the surface energy, increases and the
intermediate state is characterized by a higher free energy.
Consequently the free energies of superconducting and
intermediate states become equal at greater H/H_c. Thus p
increases. Kuper's theory gives p = ½ + 0.42(Δ/r)^{1/2}, where r
is the cylinder radius and Δ the surface energy parameter, having
the dimension of length.

We now offer a possible explanation for the observed transition
of the whisker. With reduction of wire diameter, p does not
increase indefinitely but approaches a limiting value. This
follows since at some field, between H_c/2 and H_c, the free
energies of normal and superconducting states must be equal.
Thus, as Andrew⁴ has pointed out, if the intermediate state has
not appeared when this field is reached, the specimen will prefer
a direct transition from the superconducting to the normal state.
Hence no intermediate resistance will be observed. In the general
case of an ellipsoid the limiting value of p is easily shown to
be (1 − n)^{1/2}, where n is the demagnetizing coefficient.
Applying this to the transverse cylinder (n = ½), one would not
expect to observe transitions from the superconducting to the
intermediate state at values of H/H_c > 0.71. Kuper's result
gives p = 0.71 for r < 16Δ. Thus, for r < 16Δ, restoration of
normal resistance should be abrupt, and should occur at
H/H_c = 0.71. Since Δ ~ 10⁻⁵ cm,⁷ the discontinuous transition of
the whisker appears reasonable. The occurrence of the transition
at H/H_c < 0.71 might be explained by a somewhat noncircular
cross section. The appearance of the intermediate state at the
higher temperatures could be due to a decrease of Δ. Such a
temperature dependence would, however, be contrary to the usual
interpretations of other experiments.⁷ Further investigations are
in progress.

FIG. 2. Comparison of whisker transition with results of Andrew
for larger wires. 1: Andrew, 1.05×10⁻¹ cm diameter, 1.66°K.
2: Andrew, 27×10⁻⁴ cm diameter, 1.66°K. O: whisker, 1.2×10⁻⁴ cm
diameter, 1.69°K.
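
The limiting value of p quoted above follows from equating the free energies of the fully superconducting and fully normal ellipsoid. As a hedged sketch of that standard argument: a superconducting ellipsoid with demagnetizing coefficient n in an applied field H has its magnetic energy raised by H²/8π(1 − n) per unit volume, while the normal state costs H_c²/8π, so the two are equal at H/H_c = (1 − n)^{1/2}. The function name below is illustrative.

```python
import math

def limiting_p(n):
    """Reduced field H/H_c at which the superconducting and normal
    free energies of an ellipsoid are equal, for demagnetizing
    coefficient n (0 <= n < 1)."""
    if not 0.0 <= n < 1.0:
        raise ValueError("demagnetizing coefficient must satisfy 0 <= n < 1")
    return math.sqrt(1.0 - n)

# Long cylinder parallel to the field, sphere, and transverse cylinder.
for label, n in (("longitudinal cylinder", 0.0),
                 ("sphere", 1.0 / 3.0),
                 ("transverse cylinder", 0.5)):
    print(f"{label:22s} n = {n:.3f}  limiting p = {limiting_p(n):.3f}")
```

For the transverse cylinder this gives about 0.71, which the observed abrupt transition at 0.67H_c approaches from below.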

* This research was supported by the United States Air Force, 
through the Office of Scientific Research of the Air Research and 
Development Command. 

† Permanent address: Lincoln Laboratory, Massachusetts
Institute of Technology, Cambridge 39, Massachusetts.


Anomalous Longitudinal Magnetoresistance
of Metal Single Crystals 


M. C. STEELE 


Naval Research Laboratory, Washington, D. C. 
(Received January 26, 1955) 


MEASUREMENTS of the longitudinal magnetoresistance of pure
(99.92 percent) antimony single crystal plates have been made in
fields up to 60 kilogauss at several temperatures ranging from
1.5°K to 300°K. The results for a representative rectangular
crystal plate (10 mm long, 2.5 mm wide, and 0.69 mm thick) are
shown in Fig. 1 where the change in resistance (ΔR) divided by
the zero field resistance (R₀) at the particular temperature is
plotted as a function of magnetic field strength (H). The
trigonal axis of the crystal was perpendicular to the wide face
of the plate while a binary axis was parallel to the width. The
measuring current and H were both parallel to the length of the
plate. Current and potential leads were attached to the crystal
with low melting point solder. Potential measurements were made
to 10⁻⁷ volt by means of a five-dial potentiometer. The anomalous
behavior is most clearly seen in the low-temperature field sweeps
(1.52°K and 4.21°K). At these temperatures the value of ΔR/R₀ not
only exhibits a maximum, but also passes through zero and becomes
negative. At the strongest field (60 kilogauss) the resistance
was 56 percent less than the zero field resistance. The maximum
in ΔR/R₀ was still clearly observable in the 52°K run, but at
78°K the curve seems to be approaching either a maximum or
saturation at 60 kilogauss. Room temperature data (300°K) show no
unusual behavior.

FIG. 1. Change of resistance for an antimony single crystal in a
longitudinal magnetic field.

The changing shape of the curves as the temperature 
increases suggests that there are at least two conduction 
mechanisms responsible for the totality of data. At high 
temperatures (300°K) the curve exhibits the parabolic 
field dependence that might be expected for an anisotropic
conductor.¹ At low temperatures it would seem
that there is a second mechanism which decreases the re- 
sistance as H increases. Such a mechanism may be 
similar to that proposed by MacDonald² to explain the
results of his experiments with fine sodium wires (diame- 
ter less than mean free path). His explanation is based 
on the decrease in scattering from the surface when the 
classical orbit radius of the electron, due to the applied 
magnetic field, becomes comparable to the specimen 
size. Since the thickness of the plate in the present ex- 
periment is at least ten times the diameter of the wires 
used by MacDonald, it seems likely that the magnitude 
of the bulk electronic mean free path relative to both 
the specimen thickness and the classical orbit radius will 
enter into the description of the phenomena. The fact 
that the maximum is still observable at 52°K suggests 
that there might be a dimension less than the thickness 
but greater than atomic dimensions, which is a measure 
of some internal cleavage plane spacing. Further experi- 
ments are underway to resolve this question and the 
dependence upon crystal orientation. 

A phenomenological explanation of the high-tempera- 
ture data (bulk effect only) based upon the electronic 
ellipsoid scheme for antimony derived from magnetic 
susceptibility³ data is being attempted and will be re-
ported at a later date. 

The author wishes to express his thanks to Dr. E. I. 
Salkovitz for making available the antimony crystal, 
and to Dr. J. I. Kaplan for stimulating discussions 
relating to the problem. 

1 See, for example, A. H. Wilson, Theory of Metals (Cambridge
University Press, London, England, 1953).
2 D. K. C. MacDonald, Nature 163, 637 (1949).
3 D. Shoenberg, Trans. Roy. Soc. (London) A245, 1 (1952).


Theory of Donor Levels in Silicon


W. KOHN, Carnegie Institute of Technology, Pittsburgh,
Pennsylvania and Bell Telephone Laboratories, 
Murray Hill, New Jersey 


AND 
J. M. LUTTINGER, University of Michigan, Ann Arbor, Michigan
and Bell Telephone Laboratories, Murray Hill, New Jersey 
(Received January 3, 1955) 


WE have extended our work on the ground state of a donor
electron in Si¹ to estimate the positions of the low-lying
excited levels. Our calculations are based on the following
model:

(1) The conduction band has 6 minima in the (1,0,0) and
equivalent directions.² At each minimum the band is
nondegenerate.

(2) The effective masses are m₁ = 0.19m (twice), m₃ = 0.98m.²

(3) Except in the immediate vicinity of the donor atom the
donor states are described by functions of the form

$$\psi = \sum_{j=1}^{6} \alpha^{(j)} F^{(j)}(\mathbf{r})\, \psi(\mathbf{k}^{(j)}, \mathbf{r}),$$

where the F^(j)(r) are modulating functions satisfying
appropriate effective mass equations, the ψ(k^(j), r) are the
Bloch functions at the 6 minima k^(j) of the conduction band and
the α^(j) are constants satisfying the requirements of
tetrahedral symmetry.

(4) Shifts of the energy levels relative to their values in the
effective mass theory are attributed to failure of the effective
mass formalism in the vicinity of the donor atom.¹ From the known
shift of the ground state, the shifts of the other levels are
estimated.

Table I contains our results. We have included the level
positions as calculated from the effective mass Schrödinger
equation, and the corrected level positions for P, As, and Sb
donors, where allowance for the partial breakdown of the
effective mass formalism has been made.

The effects of lattice vibrations have not been 
included. 

Optical transitions from the ground state will take
place primarily to the p-states. 

A detailed report is being submitted to the Physical 
Review. 

We wish to express our thanks to the staff of the Bell 
Telephone Laboratories, where this work was begun, for 
their hospitality and to Dr. R. C. Fletcher, Dr. C. 
Herring, and Dr. G. Wannier for many stimulating 
discussions. 

1 J. M. Luttinger and W. Kohn, Phys. Rev. 96, 802 (1954) and
Phys. Rev. 97, 1721 (1955).
2 R. N. Dexter et al., Phys. Rev. 96, 222 (1954).


Thermoelectric Power of Germanium 
at Low Temperatures 


E. MOOSER* AND S. B. WOODS


Division of Physics, National Research Council, Ottawa, Canada 
(Received January 28, 1955) 


GUREVICH¹ has pointed out that the lattice vibrations in a
metal under a temperature gradient tend to scatter the electrons
preferentially toward the colder end of the sample. This effect
should create an additional term in the thermoelectric power, Q,
which may then be written

$$Q = Q_e + Q_p,$$

where Q_e is due to the usual electron diffusion and Q_p arises
from the "phonon drag" mentioned above. Q_p has not so far been
detected in thermoelectric power measurements on pure metals (see
also MacDonald, Pearson, and White;² MacDonald³). On the other
hand, measurements of Q made on germanium show an anomalous
increase below 200°K and to explain these results, Frederikse,⁴
Herring,⁵ and MacDonald⁶ independently derived theoretical
expressions for Q_p in semiconductors.

We wish to report here measurements which con- 
tribute evidence for the existence of such a term in the 
thermoelectric power of germanium. The two samples, 
S1 and S2, used in these measurements, are of higher
purity than those used by Frederikse⁴ and by Geballe


TABLE I. Level scheme of donor states in silicon.

Representationsᵃ    Number of degenerateᵇ states    (Energy in ev) × 10²ᶜ: Eff. mass theory, P, As, Sb

A₁+E+T₁

−2.9±0.1   −2.9±0.1   −1.13±0.06   −0.88±0.06   −0.88±0.06   −0.59±0.02   −0.57±0.06
−3.9ᵈ      … ±0.2     −1.13±0.06   −0.94±0.08   −0.90±0.08   −0.59±0.02   −0.57±0.06
−4.9ᵈ      −3.3±0.4   −1.13±0.06   −1.11±0.10   −0.95±0.13   −0.59±0.02   −0.57±0.06
−4.4ᵈ      −3.2±0.3   −1.13±0.06   −1.06±0.10   −0.93±0.11   −0.59±0.02   −0.57±0.06

a Eyring, Walter, and Kimball, Quantum Chemistry (John Wiley and Sons, Inc., New York, 1944), p. 388.
b These states are only approximately degenerate, consisting in general of several strictly degenerate sets, as appears from the second column. Spin degeneracy is not included.
c The indicated errors represent estimated uncertainties within the framework of the present model.
d Experimental.
FIG. 1. Thermoelectric power of p-type germanium samples.


and Hull⁷ in their work in the same temperature region.
S1 is a polycrystalline p-type sample with a resistivity 
of ~30 ohm cm at 295°K. It is 2.23 mmX2.20 mm in 
cross section, 55 mm in length, and contains about six 
crystallites. S2 is an n-type single crystal with a re- 
sistivity of ~50 ohm cm at 295°K and has dimensions 
2.26 mm X 1.74 mm X39.5 mm. 

The samples were mounted in a cryostat described 
previously by White and Woods⁸ and the thermoelectric
voltage was measured with an electrometer (necessary 
because of the high sample resistance). One end of the 
sample was connected to the earthed cryostat, while the 
heater and thermometers were connected to the sample 
through junctions of resistance greater than 5X10° 
ohms. The electrometer, having an input resistance of 
~10" ohms, was connected to the sample at the 
thermometer junctions and was earthed only through 
the sample. 

The variation of the thermoelectric power of S1 with
temperature is shown in Fig. 1 together with a curve taken from
the results of Geballe and Hull for an indium-doped p-type sample
with a resistivity of 21.5 ohm cm at 300°K. The general features
of the two curves are similar and are in good agreement with
Herring's theory; the differences are almost certainly due to
differing dimensions and impurity content.


FIG. 2. Thermoelectric power of n-type germanium samples.


Figure 2 shows the results obtained for sample S2 and 
a curve, taken from Geballe and Hull, for an n-type 
sample with a resistivity of 18.5 ohm cm at 300°K. It is 
evident that the two curves agree well down to about 
40°K, below which temperature the thermoelectric 
power of our sample decreases abruptly. Since the re- 
sults are reproducible within the experimental error, 
which is relatively large because of the small tempera- 
ture differences used (~2 percent of the mean sample 
temperature), we believe the effect is real, but it does 
not seem to be accounted for by the present theories. A 
satisfactory theoretical interpretation must await meas- 
urements of other properties of this sample, such as, for 
example, the Hall voltage and electrical resistivity at 
these low temperatures. 

We wish to thank Dr. D. K. C. MacDonald for sug- 
gesting this investigation and for his continued interest 
in it. 

* National Research Laboratories Postdoctorate Fellow.

1 L. Gurevich, J. Phys. (U.S.S.R.) 9, 477 (1945); 10, 67 (1946).
2 MacDonald, Pearson, and White, Bull. Intern. Inst. Refrig.
3 D. K. C. MacDonald, Inst. intern. phys. Solvay, 10th Conseil phys. (1954).
4 H. P. R. Frederikse, Phys. Rev. 91, 491 (1953); 92, 248 (1953).
5 C. Herring, Phys. Rev. 96, 1163 (1954).
6 D. K. C. MacDonald, Physica 22, 986 (1954).
7 T. H. Geballe and G. W. Hull, Phys. Rev. 94, 1134 (1954).
8 G. K. White and S. B. Woods, Can. J. Phys. (to be published).


Excited Donor Levels in Silicon* 


WALTER H. KLEINER 
Lincoln Laboratory, Massachusetts Institute of Technology, 
Lexington 73, Massachusetts 
(Received January 10, 1955) 


EXCITED energy levels of a monovalent donor impurity have been
calculated approximately in the effective mass approximation
taking account of anisotropy of the effective mass. The results
lead to an identification of observed¹ infrared absorption lines
in which the mass anisotropy is an essential feature. We use the
effective mass approximation, in spite of the many and great
difficulties involved in its theoretical justification, because
its relative simplicity allows calculations to be completed in a
reasonable time and the results give at least qualitative
insight.

Our treatment is based on the effective-mass Schrödinger
equation:

$$H\psi = E\psi, \quad H = (p_x^2 + p_y^2)/2m_\perp + p_z^2/2m_\parallel + V(r), \qquad (1)$$

for the donor impurity electron in silicon, interaction between
the six degenerate k values being neglected. m_⊥ and m_∥ are the
transverse and longitudinal effective masses taken from
experiment; for Si,² m_⊥/m₀ = 0.19, m_∥/m₀ = 0.98 and
γ = m_∥/m_⊥ = 5.2. m₀ is the free electron mass; V is the
difference between the potential energy of the impurity electron
in the crystal containing one impurity ion and the potential
energy in the perfect crystal. V(r) is here assumed spherically
symmetrical. H then has cylindrical and inversion symmetry, and
the z-component of the angular momentum is a constant of the
motion. In the limit γ = m_∥/m_⊥ → 1 the symmetry becomes
spherical.

As a calculational and conceptual aid we separate the problem
into two parts: (1) a consideration of the effect of the
effective mass anisotropy, and (2) a consideration of the effect
of the detailed shape of V(r). The procedure is: Partition the
eigenstates of (1) into sets which become degenerate as γ → 1 and
find the mean energy E_nl of the 2l+1 eigenstates of a set. Fit
the spectrum of the E_nl's to the eigenspectrum of
H_γ ψ = E_nl ψ, where

$$H_\gamma = (p_x^2 + p_y^2 + p_z^2)/2m(\gamma) + V(r), \qquad (2)$$

and thereby determine an effective isotropic mass m(γ). It is
assumed that the m(γ)'s determined from different pairs of levels
will be nearly the same at least in the region of the spectrum
that is of interest, and further that the value of m(γ) is
sufficiently independent of choices of V(r) for the choices
considered. m(γ) is given quite generally in the limit γ → 1 by
3/m(1) = 2/m_⊥ + 1/m_∥.
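
The limiting expression 3/m(1) = 2/m_⊥ + 1/m_∥ is simply an average of the inverse masses over the three principal directions. A minimal sketch, using the silicon mass ratios quoted above; the function name is illustrative, and applying the γ → 1 formula at γ ≈ 5.2 is done here only to show the arithmetic, not as the fitted m(γ) of the text.

```python
def isotropic_limit_mass(m_perp, m_par):
    """Effective isotropic mass from 3/m = 2/m_perp + 1/m_par.

    The relation is stated in the text for the limit gamma -> 1;
    masses are in units of the free electron mass.
    """
    return 3.0 / (2.0 / m_perp + 1.0 / m_par)

M_PERP = 0.19   # transverse mass ratio for Si, from the text
M_PAR = 0.98    # longitudinal mass ratio for Si, from the text

gamma = M_PAR / M_PERP
m_limit = isotropic_limit_mass(M_PERP, M_PAR)

print(f"gamma = {gamma:.2f}")
print(f"inverse-mass average m/m0 = {m_limit:.4f}")
```

With equal masses the function returns that common mass, as it should for a spherical band.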

With V = −e²/κr (the static dielectric coefficient κ is 12 for
Si), we find approximate eigenvalues of (1) by the variational
method using analytic trial functions derived from hydrogenic
eigenfunctions. For example, the hydrogen 1s eigenfunction is
proportional to exp[−A(x² + y² + z²)^{1/2}] and the corresponding
trial function is proportional to
exp[−(B²x² + B²y² + D²z²)^{1/2}], where B and D are variable
parameters chosen to minimize the energy. All trial functions are
mutually orthogonal. Lampert³ has calculated the 1s level by this
method, and we have extended his calculations to excited states
with the following results for Si:

Energy level designation   1s      2p₀     2s      2p±     3p₀     3p±
Binding energy (ev)        0.0284  0.0106  0.0071  0.0058  0.0047  0.0026


We identify the four observed absorption lines¹ for As in Si as
the transitions: 1s−2p₀, 1s−2p±, 1s−3p₀, 1s−3p±. Comparison
between theory and experiment (Table I) indicates that this
identification is consistent with the observed lines, that the
calculation represents the splitting in the np levels associated
with the anisotropic effective mass to reasonable accuracy, but
that the scale of the calculated mean energy levels E_nl is too
small. E_3p − E_2p is small by a factor 0.71 while E_3p − E_1s
and E_2p − E_1s are small by factors 0.49 and 0.46, indicating
that the solution is better for excited than for ground energies,
as might be expected. The region near r = 0, which is the
presumed source of the major discrepancies, is less important for
the excited states because of their larger orbits.
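
The quoted factors 0.71, 0.49, and 0.46 can be reproduced from the calculated binding energies listed above and the experimental mean transition energies of Table I. The sketch below assumes the mean over a set of p states weights the m = 0 level once and the m = ±1 level twice (the 2l+1 states of the set); the dictionary keys and helper name are illustrative.

```python
# Calculated binding energies (ev) for Si, as listed in the text.
binding = {"1s": 0.0284, "2p0": 0.0106, "2s": 0.0071,
           "2ppm": 0.0058, "3p0": 0.0047, "3ppm": 0.0026}

def transition(level):
    """Calculated 1s -> level transition energy in ev."""
    return binding["1s"] - binding[level]

def mean_p_transition(n):
    """Mean 1s -> np transition energy: m = 0 once, m = +-1 twice."""
    return (transition(f"{n}p0") + 2.0 * transition(f"{n}ppm")) / 3.0

calc_2p = mean_p_transition(2)
calc_3p = mean_p_transition(3)

# Experimental mean transition energies (ev) from Table I.
exp_2p, exp_3p = 0.0458, 0.0516

ratio_3p_2p = (calc_3p - calc_2p) / (exp_3p - exp_2p)
ratio_3p_1s = calc_3p / exp_3p
ratio_2p_1s = calc_2p / exp_2p

print(f"calculated means: 2p {calc_2p:.4f} ev, 3p {calc_3p:.4f} ev")
print(f"ratios calc/exp: {ratio_3p_2p:.2f}, {ratio_3p_1s:.2f}, {ratio_2p_1s:.2f}")
```

The same arithmetic gives the individual calculated transition energies, for example 0.0284 − 0.0026 = 0.0258 ev for 1s−3p±.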

The small difference between the m(5.2)'s from the calculated
1s−2p and 1s−3p transitions (Table I), 0.296 and 0.299, supports
our assumption of the existence of an m(γ), while the larger
difference between the corresponding experimental values, 0.648
and 0.614, points to a deviation of V from hydrogenic form.
Changes in the E_nl from use of a more realistic V than −e²/κr
are being investigated with a view to improving the poor
agreement between the experimental and the calculated values of
E_nl shown in Table I.

TABLE I. Comparison between theory and experiment. ΔE is the
energy difference in ev of the designated transition; δ(ΔE) is the
energy difference of adjacent transitions, equivalent to the level
separation; ΔE_nl is the mean transition energy.

              Experimental                  Calculated
Transition    ΔE (ev)  δ(ΔE)   ΔE_nl        ΔE (ev)  δ(ΔE)   ΔE_nl
1s−3p±        0.0521                        0.0258
                       0.0015  0.0516                0.0021  0.0251
1s−3p₀        0.0506                        …
                       0.0031                        …
1s−2p±        …                             …
                       0.0050  0.0458                …       …
1s−2p₀        …                             …
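
The effective isotropic masses quoted above can be recovered by matching each mean transition energy to a hydrogenic 1s → n transition in a medium of dielectric coefficient κ = 12. The following is a sketch under that assumption; the Rydberg value, function name, and variable names are supplied here for illustration, and the calculated 1s−2p mean (0.0210 ev) is derived from the binding energies listed in the text.

```python
RYDBERG_EV = 13.6057   # hydrogen Rydberg energy in ev (assumed constant)
KAPPA = 12.0           # static dielectric coefficient of Si, from the text

def isotropic_mass_ratio(delta_e_ev, n_upper):
    """m(gamma)/m0 for which a hydrogenic 1s -> n_upper transition
    in the dielectric has energy delta_e_ev."""
    ground_binding_per_mass = RYDBERG_EV / KAPPA ** 2
    return delta_e_ev / (ground_binding_per_mass * (1.0 - 1.0 / n_upper ** 2))

# Calculated mean transition energies (ev): weights 1 (m = 0) and 2 (m = +-1).
calc_2p = ((0.0284 - 0.0106) + 2.0 * (0.0284 - 0.0058)) / 3.0
calc_3p = ((0.0284 - 0.0047) + 2.0 * (0.0284 - 0.0026)) / 3.0

m_calc_2p = isotropic_mass_ratio(calc_2p, 2)
m_calc_3p = isotropic_mass_ratio(calc_3p, 3)

# Experimental mean transition energies (ev) from Table I.
m_exp_2p = isotropic_mass_ratio(0.0458, 2)
m_exp_3p = isotropic_mass_ratio(0.0516, 3)

print(f"calculated:   {m_calc_2p:.3f} (1s-2p), {m_calc_3p:.3f} (1s-3p)")
print(f"experimental: {m_exp_2p:.3f} (1s-2p), {m_exp_3p:.3f} (1s-3p)")
```

The calculated pair agrees closely, while the experimental pair differs by several percent, which is the contrast the text draws on.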

I am pleased to express my appreciation to Dr. B. 
Lax and Dr. H. J. Zeiger of this laboratory for many 
stimulating discussions on this subject, and to Burstein
et al.¹ for permission to use their data prior to publication.

* The research in this document was supported jointly by the 
Army, Navy, and Air Force under contract with the Massachusetts 
Institute of Technology. 

1 Burstein, Picus, and Henvis (private communication).
2 Dexter, Lax, Kip, and Dresselhaus, Phys. Rev. 96, 222 (1954).
3 M. A. Lampert (private communication).


Theory of Melting and Yield Strengths 


HENRY AROESTE
Guggenheim Jet Propulsion Center, California Institute 
of Technology, Pasadena, California 
(Received January 10, 1955) 


THE theory of melting proposed by Fürth,¹ which has been
related to the rupture strength, has been criticized as yielding
fortuitous results largely because rupture strengths have been
associated more definitely with surface phenomena.² It may
therefore be of interest to try to relate Fürth's theory or a
modification thereof to the yield strength, which is less
surface-dependent.
If we restrict ourselves to single crystals of high purity, the
Frank-Read mechanism for yielding presumably applies. The
increase in energy, U, in forming a semicircular dislocation of
radius R from an edge dislocation of width ζ may be written as

$$U = -\tfrac{1}{2}\pi R^2 \tau b + U_e + U_m, \qquad (1)$$

where τ is the applied shear stress, b is the Burgers vector, and
U_e and U_m represent, respectively, the increase in elastic and
misfit energy. Making the simple assumption that the mixed
semicircular dislocation is half edge and half screw, one obtains
approximately for U_e and U_m³

$$U_e \approx \left[(2\pi - 4 - \pi\sigma) R \mu b^2 / 8\pi(1-\sigma)\right]\left[\ln(4R/\zeta) - 1\right], \qquad (2)$$

$$U_m \approx (2\pi - 4 - \pi\sigma) R \mu b^2 / 8\pi(1-\sigma). \qquad (3)$$

Here μ and σ are respectively the shear modulus and Poisson's
ratio of an isotropic crystal. In writing U_e we have put ε, the
usual lower limit of integration, approximately equal to ζ/2. The
general conclusions drawn later are not particularly dependent on
this choice.

The critical yield stress, τ_c, will correspond to the value of
τ when ∂U/∂R = 0. Thus we obtain

$$\tau_c = (k/\pi R b)\left[\ln(4R/\zeta) + 1\right], \qquad (4)$$

where

$$k = (2\pi - 4 - \pi\sigma)\mu b^2 / 8\pi(1-\sigma). \qquad (5)$$

Equations (4) and (5) are derived for an initial edge dislocation
because it may easily be shown that a screw or mixed straight
dislocation will give a higher value for τ_c.

If we take for ζ the result obtained by Foreman, Jaswon, and
Wood⁴ that

$$\zeta = \mu b / 2\pi(1-\sigma)\tau_m, \qquad (6)$$

where τ_m is the theoretical shear strength, and use for τ_m the
lowest value thus far derived,⁵ i.e., about μ/30, we obtain

$$\zeta \approx 15b/\pi(1-\sigma). \qquad (7)$$

Now, from the theory of Fürth, which assumes that melting is
due to the break up of a block structure, we may write that

$$2R \approx 6bA/Q, \qquad (8)$$

where A is the heat of sublimation and Q is the heat of melting.
Using Eqs. (7) and (8) in (4), we obtain

$$\tau_c \approx \left[(2\pi - 4 - \pi\sigma)\mu Q / 24\pi^2(1-\sigma)A\right] \times \left\{\ln\left[12\pi(1-\sigma)A/15Q\right] + 1\right\}. \qquad (9)$$

This formula is in principle applicable to unworked 
materials close to the absolute zero. The yield strengths 
of a few metal crystals such as Zn and Cd and of the 
ionic crystal NaCl have been measured at very low 
temperatures. One may also very roughly extrapolate 
the data on other crystals to the absolute zero. The yield 
strengths derived from formula (9) are at least an order 
of magnitude too high. One may lessen the discrepancy
a bit by using the suggestion of Fisher⁶ that a single-ended
source near the surface should start to operate at
one half the stress needed for a double-ended source of 
the same length. 

There are two further ways to achieve lower results.
One is to assume that the crystal is not homogeneously
blocked and that there are some much longer blocks
which are operative. Another, which is more interesting
and more likely, is to modify Fürth's formula to read,
say,

$$2R = 6(\zeta/2)A/Q.$$


This assumes that the mechanism of melting is tied 
intrinsically not only to the block size, but also to a 
width between blocks, where the atoms are misfit and 
may be expected to enter into the mechanism first. 

The author wishes to thank Professor H. S. Tsien for 
helpful discussion. 


¹ R. Fürth, Phil. Mag. 40, 1227 (1949), and other references
cited there.

² J. Frenkel, Kinetic Theory of Liquids (Oxford University Press,
London, England, 1946), p. 101.

³ See for example A. Cottrell, Dislocations and Plastic Flow in
Crystals (Oxford University Press, London, England, 1953) for
basic formulas.

⁴ Foreman, Jaswon, and Wood, Proc. Phys. Soc. (London)
A64, 156 (1951).

⁵ J. Mackenzie, thesis, University of Bristol, 1949 (unpublished).

⁶ See reference 3, p. 86.


Theory of the Meissner Effect in
Superconductors


J. BARDEEN 
University of Illinois, Urbana, Illinois 
(Received January 3, 1955) 


THE general features of the superconducting state
are now well-established, although a good mathematical
or detailed physical description is lacking.
Pippard¹ has shown that the wave functions (range-of-order)
of the electrons in the superconducting state
extend over relatively large distances ($\sim10^{-4}$ cm) and
that the penetration depth does not vary much with
magnetic field. The latter implies that a linear theory, in
which only first-order changes of wave functions produced
by the magnetic field are included, should be
satisfactory. As pointed out particularly by Slater,²
wave functions extending over large areas are favorable
for a large diamagnetism. While it is thought that the
Meissner effect ($B=0$) follows rather generally from
these considerations, it has been difficult to treat a
specific model. One model, which is a modification of a
degenerate free-electron gas, is discussed below.

We assume that in the superconducting state a finite
energy $\epsilon\sim kT_c$ is required to excite electrons from the
surface of the Fermi sea, and that electrons so excited
behave much like excited electrons in the normal state.
This model has been discussed in a qualitative way by
Welker³ and others. An adequate description of the
“condensed” superconducting state probably requires
going beyond a one-particle model.⁴

To avoid introduction of a boundary, we follow the
method of Klein⁵ and Schafroth⁶ in which an infinite
medium is considered and the sources of the magnetic
field are introduced in the interior. A relation is derived
between the Fourier component $\mathbf{i}(\mathbf{k})$ of the current
density of the electrons and the corresponding component
$\mathbf{A}(\mathbf{k})$ of the vector potential. If the gauge in $\mathbf{A}$ is
chosen so that $\mathrm{div}\,\mathbf{A}=0$, then $\mathbf{A}$ and $\mathbf{k}$ are perpendicular,
and $\mathbf{i}(\mathbf{k})$ is parallel to $\mathbf{A}(\mathbf{k})$. Thus we may write:

$$(4\pi/c)\,\mathbf{i}(\mathbf{k}) = -F(k)\,\mathbf{A}(\mathbf{k}). \tag{1}$$


As pointed out by Klein and by Schafroth, the Meissner
effect is obtained if $F(0)>0$. If $\mathbf{A}_0(\mathbf{k})$ represents the
source field, the self-consistent solution is obtained from:

$$[k^{2}+F(k)]\,\mathbf{A}(\mathbf{k}) = k^{2}\mathbf{A}_0(\mathbf{k}). \tag{2}$$


If one assumes with London that the wave functions
are not modified at all by the magnetic field,

$$F(k) = \lambda_0^{-2} = 4\pi e^{2}n/mc^{2}, \tag{3}$$

where $\lambda_0$ is the penetration depth and $n$ is the electron
concentration. Actually, one should include first-order
perturbation changes resulting from the field. We assume
that the energies of excited states and matrix
elements are similar to those of a normal degenerate
electron gas, except for the additional energy $\epsilon$ required
for each excited electron. A modification of Klein's
treatment then gives


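$$\lambda_0^{2}F(k) = 1 - \frac{3k_0}{2k}\int_0^1 u(1-u^{2})\log\left(\frac{k_0u+\tfrac{1}{2}k+m\epsilon/\hbar^{2}k}{|k_0u-\tfrac{1}{2}k|+m\epsilon/\hbar^{2}k}\right)du, \tag{4}$$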


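where $k_0$ is the magnitude of the wave vector of the
Fermi surface. When $\epsilon=0$, this gives the usual Landau
diamagnetism. When $\epsilon>0$, $F(k)\to\lambda_0^{-2}$ as $k\to0$. For $k$
such that $\lambda_0k\sim1$ and $m\epsilon/\hbar^{2}k^{2}\sim1$, a good approximation
to $F(k)$ is

$$\lambda_0^{2}F(k) = \frac{3}{2}\,\frac{m\epsilon}{\hbar^{2}kk_0}\log\left(1+\frac{\hbar^{2}kk_0}{m\epsilon}\right). \tag{5}$$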


This expression (5) is valid over the range of $k$ which
makes an appreciable contribution to the field for
normal penetration phenomena. The penetration depth
is obtained from the integral

$$\lambda = \frac{2}{\pi}\int_0^\infty \frac{dk}{k^{2}+F(k)}, \tag{6}$$

which may be evaluated approximately by replacing $k$
in the logarithm by an average value $\sim\lambda_0^{-1}$. We then
find:

$$\lambda = 0.77\lambda_0\left[\frac{3}{2}\,\frac{m\epsilon\lambda_0}{\hbar^{2}k_0}\log\left(1+\frac{\hbar^{2}k_0}{m\epsilon\lambda_0}\right)\right]^{-1/3}. \tag{7}$$


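With $\epsilon\sim5\times10^{-16}$ ergs, $k_0\sim10^{8}$ cm$^{-1}$, $\lambda_0\sim10^{-6}$ cm, we
find $\lambda\sim2\lambda_0$. Since the logarithm is slowly varying, $\lambda$
varies approximately as $(E_F/kT_c)^{1/3}$ or as $n^{-2/9}\epsilon^{-1/3}$.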
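The estimate $\lambda\sim2\lambda_0$ can be reproduced from Eq. (7) with a few lines of arithmetic. This is a sketch under stated assumptions: the exponents in the quoted magnitudes are read from a damaged scan as $\epsilon\sim5\times10^{-16}$ erg, $k_0\sim10^{8}$ cm$^{-1}$ and $\lambda_0\sim10^{-6}$ cm, and the free-electron mass and $\hbar$ are inserted in cgs units. The function name `lambda_ratio` is illustrative only.

```python
import math

HBAR = 1.0546e-27   # erg s (cgs), assumed constant
M_E = 9.109e-28     # g, free-electron mass, assumed constant

def lambda_ratio(eps, k0, lam0):
    """lambda / lambda_0 from Eq. (7) as reconstructed here."""
    x = M_E * eps * lam0 / (HBAR**2 * k0)   # dimensionless m*eps*lam0 / (hbar^2 k0)
    return 0.77 * (1.5 * x * math.log(1 + 1 / x)) ** (-1 / 3)

# Magnitudes quoted in the text (exponents are an assumed reading).
ratio = lambda_ratio(eps=5e-16, k0=1e8, lam0=1e-6)
print(round(ratio, 2))

# The numerical factor 0.77 is what one obtains from Eq. (6) when F(k) is
# taken proportional to 1/k, namely 4 / (3 * sqrt(3)).
print(4 / (3 * math.sqrt(3)))
```

With these inputs the ratio comes out somewhat above 2, in line with the statement in the text. Because the logarithm varies slowly, order-of-magnitude changes in the inputs move the result only modestly.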
The nature of the excited states in actual superconductors
is indicated by the temperature variation of
the specific heat, thermal conduction, and electrical
conduction (observed in the skin depth at microwave
frequencies). These all indicate a density of “normal”
electrons in excited states, such as would follow from our
model. The energy $\epsilon$ undoubtedly depends on temperature
and goes to zero at the transition point. Semi-empirical
expressions for free energy and critical field
derived from a model of this sort are in good agreement
with experiment.⁷ There is less justification for assuming
that the matrix elements of the magnetic interaction are
unchanged by the transition, but one would not expect a
change in matrix elements to alter the results in a
drastic way. Thus any model which gives correctly the
thermodynamical properties of the superconducting
state will most likely give the Meissner effect.

¹ A. B. Pippard, Proc. Roy. Soc. (London) A203, 98 (1950).

² J. C. Slater, Phys. Rev. 51, 195 (1937); 52, 214 (1937).

³ H. Welker, Z. Physik 114, 525 (1939).

⁴ A completely filled band yields only a small diamagnetism even
when the energy gap is small. The author [Phys. Rev. 81, 829
(1951)] has proposed a one-particle model in which the electrical
properties can be described by a small number of particles with
very small effective mass. Klein's method as applied to this model
gives a large but not perfect diamagnetism. As pointed out by H.
Fröhlich [Nature 168, 280 (1951)], there is a small but finite
residual field in the interior of a massive specimen. Although such
a field penetration is not ruled out by experiments, it seems
unlikely to occur. This is probably as close as one can come to the
Meissner effect using a purely individual particle description. The
assumption of a “condensed state” goes beyond such a description.

⁵ O. Klein, Arkiv. Mat., Astron. Fysik 31A, No. 12 (1944).

⁶ M. R. Schafroth, Helv. Phys. Acta 24, 645 (1951).

⁷ W. L. Ginsburg, J. Exptl. Theoret. Phys. (U.S.S.R.) 14, 134
(1946); Fortschr. Physik 1, 101 (1953).


Energy Distribution of Protons Due to 
Collision Energy Loss 


K. C. Hines 
Physics Department, University of Melbourne, Melbourne, Australia* 
(Received July 19, 1954; revised manuscript received 
January 28, 1955) 


EXPERIMENTS have shown that the Landau
distribution is too narrow for electrons (e.g.,
Rothwell,¹ West,² Birkhoff³). Blunck and Leisegang,⁴
Blunck⁵ and others, using essentially the same method
as Landau,⁶ have attempted to improve the calculated
distribution by inclusion of the so-called second moment
term which they obtained in an approximate form by
using the semiclassical Bohr treatment of collision loss.
Apart from this improvement, however, their treatment
suffers from the same defects as the Landau theory.
Recently Fano⁷ has given a complete formulation of
the general problem of the passage of charged particles
through layers of material thick enough to contain the
whole range of the particles, although the only solution
quoted which is specific to the Landau problem (thin
layers) is that of Landau himself. Fano states two of the
errors in the Landau treatment, viz., use of approximate
probability distributions for energy loss and extension
of the upper limit of integration over $\epsilon$ (energy loss) to
infinity. It is the purpose of the present note to indicate
an accurate method of solving the problem and to apply
this method to the energy distribution of protons.
The transport equation is


$$\frac{\partial g(x,E)}{\partial x} = \int_{\epsilon_{\min}}^{E_0-E} g(x,E+\epsilon)\,w(E+\epsilon,\epsilon)\,d\epsilon - \int_{\epsilon_{\min}}^{\epsilon_{\max}} g(x,E)\,w(E,\epsilon)\,d\epsilon, \tag{1}$$

where $g(x,E)\,dE$ is the number of particles in $(E,dE)$, $E_0$
is the initial energy, and $\epsilon_{\min}$, $\epsilon_{\max}$ are, respectively, the
minimum and maximum amounts of energy which can
be transferred in a single collision. If the layer of
absorbing material is thin enough, it is justifiable to
regard the function $w(E,\epsilon)$, the probability per unit
length of an energy loss $\epsilon$, as independent of the
particle energy $E$.

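Instead of the Laplace transform used by Landau we
apply the Mellin transform to define a transform function
$G(x,s)$ satisfying

$$\frac{\partial G(x,s)}{\partial x} = \int_0^{E_0} dL \int_{\epsilon_{\min}}^{E_0-L} L^{s-1}\left(1-\frac{\epsilon}{L}\right)^{s-1} g(x,L)\,w(\epsilon)\,d\epsilon - G(x,s)\int_{\epsilon_{\min}}^{\epsilon_{\max}} w(\epsilon)\,d\epsilon, \tag{2}$$

with $L=E+\epsilon$.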

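The range of integration over $\epsilon$ is split up by choosing
$\epsilon_1$ such that the first three terms of the binomial expansion
are adequate for all values of $\epsilon$ between $\epsilon_{\min}$
and $\epsilon_1$:

$$\frac{\partial G(x,s)}{\partial x} = -G(x,s)\int_{\epsilon_1}^{\epsilon_{\max}} w(\epsilon)\,d\epsilon - (s-1)G(x,s-1)\int_{\epsilon_{\min}}^{\epsilon_1}\epsilon\,w(\epsilon)\,d\epsilon + \frac{(s-1)(s-2)}{2}G(x,s-2)\int_{\epsilon_{\min}}^{\epsilon_1}\epsilon^{2}w(\epsilon)\,d\epsilon + \int_0^{E_0} dL\int_{\epsilon_1}^{E_0-L} L^{s-1}\left(1-\frac{\epsilon}{L}\right)^{s-1} g(x,L)\,w(\epsilon)\,d\epsilon. \tag{3}$$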


The first three terms on the right hand side of (3) 
provide a Gaussian approximation to $g(x,E)$ with
roughly the correct half-width. The third (second 
moment) term must be worked out by an accurate 
quantum mechanical method, analogous to the Bethe 
stopping-power calculation for the first moment. 
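The adequacy of a three-term binomial expansion of $(1-\epsilon/L)^{s-1}$ for small $\epsilon/L$ can be illustrated numerically. The sketch below is not from the note: the energy `L`, the transform variable `s` (taken real here for simplicity) and the trial energy losses are hypothetical values chosen only to show how the truncation error scales.

```python
def exact_factor(eps, L, s):
    """The factor (1 - eps/L)**(s - 1) that appears under the Mellin transform."""
    return (1 - eps / L) ** (s - 1)

def three_term(eps, L, s):
    """First three terms of the binomial expansion in eps/L."""
    r = eps / L
    return 1 - (s - 1) * r + 0.5 * (s - 1) * (s - 2) * r**2

# Hypothetical values: L in Mev, s real; neither is taken from the note.
L, s = 0.4, 3.5
errors = {}
for eps in (1e-4, 1e-3, 1e-2):
    errors[eps] = abs(exact_factor(eps, L, s) - three_term(eps, L, s))
    print(eps, exact_factor(eps, L, s), three_term(eps, L, s), errors[eps])
```

The truncation error falls roughly as $(\epsilon/L)^{3}$, which is why a cut-off $\epsilon_1$ can be chosen below which three terms suffice, while larger energy losses need the separate treatment of the last term.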


For protons, in general, $E_0-L>\epsilon_{\max}$ and the upper
limit of integration in the last term of (3) must be taken
as $\epsilon_{\max}$ and not $E_0-L$. It has been found essential to use
the correct upper limit here and this fact indicates a
serious defect in the Landau treatment since the latter
cannot be adapted for use with the correct upper limit.

The last term in (3) may be evaluated by expanding 
out the binomial series (including 4 terms of the ex- 
pansion), inserting the expression for w(e) (see Bethe), 
and carrying out the integrations. After combining the 
result with the other terms of (3), one obtains 


0G(x,s) 
= —B(s—1)G(x, s—1) 
Ox 


—1)(s—2 
$e Mo s—2) 


(s—1)(s—2)(s—3) 
a G(x, s—3), 
3X2X2 


(4) 





in which 


e- f oe f bei 


€min €min 


2mv | | 
Eh 
1—2/¢e 3c 


A good approximate solution (correct to terms in $x^{2}$)
for Eq. (4) is

$$G(x,s) = (E_0-x\beta)^{s-1}\exp[a_1(s)x+a_2(s)x^{2}],$$





2rNetZ | 


mv 


(6) 


in which 
(s—1)(s—2) 
Y 


(s—1)(s—2)(s—3) 
a,(s) =4 | mes 
2E¢ 


3X2X2E?° 


(7) 





and $a_2(s)$ gives small correction terms:
B(s—1) 
a 
2Eo 
(s—1)(s—2)(s—3) 
3X2X2E 
(s—2)(s—3)(s—4) 
3X2X2Ee 
wie 4(s—1)(s— 2) (s—3) (s—4) 4 re 1)(s—2) 
ii 2X4Ey! 4E? 
i 38n(s— 1) (s—2)(s—3) és By(s—1)(s—2) 
3X2X4Eo! 4Ee 


(s— 1)*(s—2)? 
4E 
v(s—1)(s—2) 
x ean 
2E? 
2 v(s—2)(s—3) 
2E¢ 


The solution (6) is inserted into the inversion formula
for the Mellin transform to yield a relation for the required
distribution function:

$$g(x,E) = \frac{1}{2\pi i}\int_{\delta-i\infty}^{\delta+i\infty}\exp[-s\log E+(s-1)\log(E_0-x\beta)+a_1(s)x+a_2(s)x^{2}]\,ds. \tag{8}$$

An experimental number versus energy plot for protons
in Al, taken from the work of Reynolds et al.,¹⁰ is shown
in Fig. 1. In order to compare this with results obtained
from the numerical evaluation of (8), one plots the
quantities $N(x,E)$ defined by

$$N(x,E) = \int_E^{E_0} g(x,E')\,dE'. \tag{9}$$


The integral distribution obtained in this way from
the present calculations is seen to fit the experimental 
result very closely. The agreement is even better when 
one notes the initial energy spread of the incident 
particles shown on the right of Fig. 1. The slight 
asymmetry of the experimental curve is present also in 
the theoretical result. 











FIG. 1. Curves for the energy distribution of 0.4263-Mev protons
after passing through $3.795\times10^{-5}$ g/cm² of Al. $N(E)$ is the number
of particles with energy greater than $E$, normalized to 1 incident
particle. Curve I gives experimental results from the work of
Reynolds et al. Curve II has been calculated on the basis of the
present theory. Curve III has been calculated on the basis of the
Landau theory. Curve IV shows the initial energy spread of the
protons of curve I.


The distribution obtained from Landau’s theory is 
also shown in Fig. 1. In view of the important omissions 
in this treatment it is not surprising that the Landau 
curve is unsatisfactory. It is worth noting that for 
electrons the Landau theory gives a distribution which 
is too narrow, whereas for protons the distribution is 
much too wide. 
The author is indebted to Dr. D. N. F. Dunbar for
making the experimental results available prior to
publication.

* This work was carried out under a grant from Melbourne 
University. 
Errata








Search for 15-Mev Gamma Radiation from
N¹⁴+d and Be⁹+α, V. K. RASMUSSEN, JOHN R.
REES, M. B. SAMPSON, AND N. S. WALL [Phys.
Rev. 96, 812 (1954)]. The subscripts $\alpha$ and $\gamma$ were
interchanged at the bottom of the first column of
page 813. The statement as to the relative probability
of $\alpha$ and $\gamma$ decay should read “$\Gamma_\alpha$ is certainly
less than 100 $\Gamma_\gamma$ and is probably less than
10 $\Gamma_\gamma$.”


Recombination Processes in Insulators and
Semiconductors, ALBERT ROSE [Phys. Rev. 97,
322 (1955)]. In item 5 of the section labeled
“Summary” on page 333, read “usually” instead of
“always.”


Average Number of Neutrons Emitted During
the Spontaneous Fission of Cf²⁵², W. W. T. CRANE,
G. H. HIGGINS, AND S. G. THOMPSON [Phys. Rev.
97, 242 (1955)]. The first sentence should read
“The average number of neutrons per spontaneous
fission of Cf²⁵² has been found to be 3.53±0.15 ···”
instead of “The average ··· has been found to be
3.10±0.15 ···.”


Origin of Nitrous Oxide in the Atmosphere,
P. HARTECK AND S. DONDES [Phys. Rev. 95, 320
(1954)]. It has come to the attention of the authors
that Adel¹ (the discoverer of nitrous oxide in the
atmosphere) has also discussed the origin of nitrous
oxide in the atmosphere. Bates and Witherspoon²
have theoretically examined the photochemistry of
the constituents of the atmosphere. The results of
Bates and Witherspoon quantitatively reflect the
idea of photochemical processes causing the presence
of nitrous oxide in the atmosphere, similar
to our own views, and question the adequate
formation of nitrous oxide by soil micro-organisms,
advocated by Adel. Obviously further biological
quantitative data are required. From a report by
Tousey³ of the presence of x-rays in the upper
atmosphere, it is possible that x-ray irradiation
also causes the formation of nitrous oxide.
We are presently also investigating this possibility
for the nitrous oxide formation.


¹ A. Adel, Science 103, 280 (1946); A. Adel, Science 113, 624
(1951).

² D. R. Bates and A. E. Witherspoon, Monthly Notices Roy.
Astron. Soc. 112, No. 1, 101 (1952).

³ R. Tousey, J. Opt. Soc. Am. 43, 245 (1953).

Spectroscopy Technique (see Methods and Instruments)
Standards (see Constants, Standards, Units)
Metastable states of finite lattice gas, Arnold J. F. Siegert—1456
Pressure-volume isotherms of He⁴ below 4.2°K, William E.
Keller—1
Principle of detailed balance, Martin J. Klein—1446
Relation between canonical and microcanonical ensembles,
Melvin Lax—1419
Second virial coefficients of He from the exp-six potential, 
John E. Kilpatrick, William E. Keller, and Edward F. 
Hammel—9 
Statistical treatment of weakly interacting particles in 
Newtonian potential, Robert W. Hart and William H. 
Guier—841 
Thermodynamic properties of He³ and He⁴ solutions, V. S.
Nanda—571
Thickness of He film as function of height, Lothar Meyer—22
Variational principles in irreversible thermodynamics with 
application to viscoelasticity, M. A. Biot—1463 
Statistical Methods (see Mathematical Methods) 
Superconductivity
Empirical relation between superconductivity and number 
of valence electrons per atom, B. T. Matthias—74 
Neutron diffraction observations on superconducting state, 
M. K. Wilkinson, C. G. Shull, L. D. Roberts, and S. 
Bernstein—889 
Paramagnetic effect in superconductors. I. 
aspects, Hans Meissner—1627 
Superconducting transitions in Sn whiskers, O. S. Lutes and
E. Maxwell—1718(L)
Superconductivity of U, John E. Kilpatrick, Edward F.
Hammel, and Dillon Mapother—1634
Theory of Meissner effect in superconductors, J. Bardeen—1724(L)
Ultrasonic attenuation due to lattice-electron interaction in
normal conducting metals, W. P. Mason—557(L)
Ultrasonic attenuation in metals by electron relaxation,
R. W. Morse—1716(L)
Supersonics (see Fluid Dynamics) 


Thermal Conductivity (see Thermal Properties) 
Thermal Diffusion (see Diffusion) 
Thermal Expansion (see Thermal Properties) 
Thermal Properties 
Thermal conductivity of In-Tl alloys at low temperatures, 
Ronald J. Sladek—902 
Vibration spectra and specific heats of cubic metals. I. 
Theory and application to sodium, A. B. Bhatia—363 
Vibrational anharmonicity and lattice thermal properties. 
II, D. K. C. MacDonald and S. K. Roy—673
Thermal Radiation (see Radiation)
Thermionic Emission (see Electrical Properties) 
Thermodynamics (see Statistical Mechanics and Thermody- 
namics) 
Thermoelectric Effect (see Electrical Properties; Semicon- 
ductors) 
Thermoluminescence (see Luminescence) 
Thermomagnetic Effect (see Magnetic Properties) 
Total Cross Sections (see Electrons and Positrons; Nuclear 
Reactions) 
Transmutation (see Nuclear Reactions) 


Uncertainty Principle (see Quantum Mechanics) 
Units (see Constants, Standards, Units) 


Vacuum Tubes (see Methods and Instruments) 
Van der Waals Forces (see Molecular Structure and Spectra) 
Viscosity (see Liquids) 


Wave Mechanics (see Quantum Mechanics) 
Work Function (see Electrical Properties) 


X-Rays 

Elastic spectrum of Cu from temperature-diffuse scattering 
of x-rays, E. H. Jacobsen—654 

Energies of the K transitions of π⁻-mesonic x-rays, M.
Stearns, M. B. Stearns, S. DeBenedetti, and L. Leipuner—240(L)

New treatment of Auger effect and fluorescence yield in 
lighter elements, R. A. Rubenstein and J. N. Snyder— 
1653 

Soft x-ray absorption of evaporated thin films of Te, Robert 
W. Woodruff and M. Parker Givens—52 

X-ray measurements of pile-irradiated LiF, D. T. Keating 
—832(L) 

X-ray spectroscopy of solid state: KCl, L. G. Parratt and
E. L. Jossem—916